CS 461: Machine Learning
Lecture 4
Dr. Kiri Wagstaff ([email protected])
1/31/08, CS 461, Winter 2009

Plan for Today
- Solution to HW 2
- Support Vector Machines
- Neural Networks
  - Perceptrons
  - Multilayer Perceptrons

Review from Lecture 3
- Decision trees: regression trees, pruning, extracting rules
- Evaluation: comparing two classifiers with McNemar's test
- Support Vector Machines
  - Classification: linear discriminants, maximum margin
  - Learning (optimization): gradient descent, QP

Neural Networks (Chapter 11)
"It Is Pitch Dark"

Perceptron: Graphical and Math Views
  y = \sum_{j=1}^{d} w_j x_j + w_0 = \mathbf{w}^T \mathbf{x}
  \mathbf{w} = [w_0, w_1, \ldots, w_d]^T
  \mathbf{x} = [1, x_1, \ldots, x_d]^T
[Alpaydin 2004, The MIT Press]

"Smooth" Output: The Sigmoid Function
Two options:
1. Calculate g(\mathbf{x}) = \mathbf{w}^T \mathbf{x} and choose C_1 if g(\mathbf{x}) > 0, or
2. Calculate y = \mathrm{sigmoid}(\mathbf{w}^T \mathbf{x}) and choose C_1 if y > 0.5
Why the sigmoid?
- Converts the output to a probability
- Gives a less "brittle" decision boundary
  y = \mathrm{sigmoid}(\mathbf{w}^T \mathbf{x}) = \frac{1}{1 + \exp(-\mathbf{w}^T \mathbf{x})}

K Outputs
Regression:
  y_i = \sum_{j=1}^{d} w_{ij} x_j + w_{i0} = \mathbf{w}_i^T \mathbf{x}, i.e. \mathbf{y} = W\mathbf{x}
Classification (softmax):
  o_i = \mathbf{w}_i^T \mathbf{x}
  y_i = \frac{\exp(o_i)}{\sum_k \exp(o_k)}
  Choose C_i if y_i = \max_k y_k
[Alpaydin 2004, The MIT Press]

Training a Neural Network
1. Randomly initialize the weights
2. Update each weight: update = learning rate * (desired - actual) * input
  \Delta w_j^t = \eta \, (y^t - \hat{y}^t) \, x_j^t

Learning Boolean AND
- Apply the same update rule \Delta w_j^t = \eta \, (y^t - \hat{y}^t) \, x_j^t
- Perceptron demo
[Alpaydin 2004, The MIT Press]

Multilayer Perceptrons (MLP = ANN)
  y_i = \mathbf{v}_i^T \mathbf{z} = \sum_{h=1}^{H} v_{ih} z_h + v_{i0}
  z_h = \mathrm{sigmoid}(\mathbf{w}_h^T \mathbf{x}) = \frac{1}{1 + \exp\!\left(-\left(\sum_{j=1}^{d} w_{hj} x_j + w_{h0}\right)\right)}
[Alpaydin 2004, The MIT Press]

XOR with an MLP
  x_1 \;\mathrm{XOR}\; x_2 = (x_1 \;\mathrm{AND}\; \sim x_2) \;\mathrm{OR}\; (\sim x_1 \;\mathrm{AND}\; x_2)
[Alpaydin 2004, The MIT Press]

Examples
- Digit recognition
- Ball balancing

ANN vs. SVM
- An SVM with a sigmoid kernel is equivalent to a 2-layer MLP
- Parameters
  - ANN: number of hidden layers, number of nodes
  - SVM: kernel, kernel parameters, C
- Optimization
  - ANN: local minimum (gradient descent)
  - SVM: global minimum (QP)
- Interpretability? About the same
- So why SVMs? Sparse solution, geometric interpretation, less likely to overfit the data

Summary: Key Points for Today
- Support Vector Machines
- Neural Networks
  - Perceptrons
  - Sigmoid
  - Training by gradient descent
  - Multilayer Perceptrons
- ANN vs. SVM

Next Time
- Midterm Exam! 9:10 - 10:40 a.m.
  - Open book, open notes (no computer)
  - Covers all material through today
- Neural Networks (read Ch. 11.1-11.8)
  - Questions to answer from the reading
  - Posted on the website (calendar)
  - Three volunteers?
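The short sketches below illustrate the formulas from the preceding slides; none of the numeric values are from the lecture. This first one shows the perceptron slide's computation y = \sum_j w_j x_j + w_0 = w^T x as a single dot product, with the bias folded in via an augmented input. The weights and input are made-up values.

```python
import numpy as np

# Perceptron as a dot product: y = sum_j w_j x_j + w_0 = w^T x,
# with the bias folded in via an augmented input x = [1, x_1, ..., x_d]^T.
# The weight and input values below are made up for illustration.
w = np.array([-0.5, 1.0, 1.0])       # [w_0, w_1, w_2]; w_0 is the bias weight
x = np.array([1.0, 0.3, 0.8])        # [1, x_1, x_2]; the leading 1 multiplies the bias

g = np.dot(w, x)                     # g(x) = w^T x
print(g, "-> choose C_1" if g > 0 else "-> choose C_2")
```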
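A companion sketch for the sigmoid slide: the same made-up weights and input, but the linear score is passed through the sigmoid so the output can be read as a probability and thresholded at 0.5 instead of 0.

```python
import numpy as np

# "Smooth" perceptron output: y = sigmoid(w^T x) = 1 / (1 + exp(-w^T x));
# choose C_1 if y > 0.5. Same made-up weights and input as above.
def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

w = np.array([-0.5, 1.0, 1.0])
x = np.array([1.0, 0.3, 0.8])

y = sigmoid(np.dot(w, x))            # can be read as an estimate of P(C_1 | x)
print(y, "-> choose C_1" if y > 0.5 else "-> choose C_2")
```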
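For the K-outputs slide, this sketch applies the softmax to a made-up 3-class weight matrix W (one row of weights per class, bias weight in column 0) and picks the class with the largest y_i, exactly as the classification rule on the slide states.

```python
import numpy as np

# K outputs with softmax: o_i = w_i^T x, y_i = exp(o_i) / sum_k exp(o_k),
# choose C_i if y_i = max_k y_k. W is a made-up 3-class weight matrix.
def softmax(o):
    e = np.exp(o - np.max(o))        # subtract the max for numerical stability
    return e / e.sum()

W = np.array([[ 0.1,  1.0, -0.5],
              [ 0.0, -0.3,  0.8],
              [ 0.2,  0.5,  0.5]])
x = np.array([1.0, 0.3, 0.8])        # augmented input [1, x_1, x_2]

o = W @ x                            # linear scores o_i = w_i^T x
y = softmax(o)                       # probabilities that sum to 1
print(y, "-> choose class", int(np.argmax(y)))
```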
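The training slide's recipe (random initialization, then update = learning rate * (desired - actual) * input) can be run directly on the Boolean AND example. The learning rate, epoch count, and random seed below are arbitrary choices, not values from the lecture.

```python
import numpy as np

# Training recipe from the slides, applied to Boolean AND:
#   1. randomly initialize the weights
#   2. repeat: delta_w_j = eta * (desired - actual) * input_j
def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

X = np.array([[1, 0, 0],             # inputs augmented with a leading 1 (bias)
              [1, 0, 1],
              [1, 1, 0],
              [1, 1, 1]], dtype=float)
y_desired = np.array([0, 0, 0, 1], dtype=float)

rng = np.random.default_rng(0)
w = rng.uniform(-0.01, 0.01, size=3) # step 1: random initialization
eta = 0.5                            # learning rate (arbitrary choice)

for epoch in range(1000):            # step 2: repeated online updates
    for x_t, y_t in zip(X, y_desired):
        y_hat = sigmoid(np.dot(w, x_t))      # actual output
        w += eta * (y_t - y_hat) * x_t       # (desired - actual) * input

print(np.round(sigmoid(X @ w)))      # expected: [0. 0. 0. 1.]
```

Because AND is linearly separable, the single perceptron converges; XOR, covered on the MLP slides, does not, which is the motivation for adding a hidden layer.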
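Finally, a sketch of the MLP forward pass from the XOR slide, using the decomposition x1 XOR x2 = (x1 AND ~x2) OR (~x1 AND x2). The hidden and output weights are hand-picked for illustration (a trained MLP would learn something similar), and a sigmoid is applied to the output for a 0/1 reading, whereas the slide's formula uses a linear output y_i = v_i^T z.

```python
import numpy as np

# MLP forward pass computing XOR with hand-picked weights.
def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

W_hidden = np.array([[-5.0,  10.0, -10.0],   # z_1 ~ (x1 AND ~x2); bias in column 0
                     [-5.0, -10.0,  10.0]])  # z_2 ~ (~x1 AND x2)
v = np.array([-5.0, 10.0, 10.0])             # output fires if z_1 OR z_2 is active

def mlp_forward(x1, x2):
    x = np.array([1.0, x1, x2])              # augmented input
    z = sigmoid(W_hidden @ x)                # hidden units z_h = sigmoid(w_h^T x)
    z_aug = np.concatenate(([1.0], z))       # augment hidden layer with a bias unit
    return sigmoid(np.dot(v, z_aug))

for x1 in (0, 1):
    for x2 in (0, 1):
        print(x1, x2, "->", round(mlp_forward(x1, x2)))   # prints the XOR truth table
```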