Machine Learning
University of Tartu, Spring 2008

Course Description: Machine learning is concerned with the development of efficient learning algorithms that perform well on novel data. With the masses of data available in today's world, the implementation of such algorithms, together with a rigorous statistical validation of those methods, is essential. This course covers the development of such algorithms (basic optimization theory, support vector machines, classification and regression, training and testing data, data preprocessing, probabilistic methods), their validation (cross-validation, statistical evaluation methods and ways of comparing algorithms) and real-world applications of these methods (for example, clustering methods in Bioinformatics, web mining, and character and voice recognition). At the end of the course the student will be familiar with the various subfields of machine learning, will be able to choose objectively the methods suited to particular datasets, and will independently carry out a project related to machine learning. There will also be assigned literature for the course, for which the students will prepare reading presentations. Finally, there will be homework and practical assignments designed to help in understanding the material.

Schedule
Lectures + Reading Presentation: Tuesdays 12pm, Liivi 2-315
Lectures + Reading Presentation: Wednesdays 4pm, Liivi 2-402
Help Session: Thursdays, Liivi 2-402 on the following dates only: Feb 14, Feb 28, Mar 13, Mar 27, Apr 24, May 8, May 22

Grades
The grade categories are as follows:
A: 91-100%
B: 81-90%
C: 71-80%
D: 61-70%
F: <60% (failed)

The grade for the course will be calculated as follows:
30% - Two exams (mid-term and final)
20% - Reading presentations
25% - Homework
30% - Project
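As a quick illustration of the weighting above, here is a minimal Python sketch (not part of the official syllabus) that computes the weighted course total from the percentages exactly as listed and maps it to a letter grade using the cutoffs in the table. The component scores in the example are hypothetical.

    # Illustrative sketch only: weights copied verbatim from the breakdown above;
    # the component scores below are hypothetical.
    WEIGHTS = {
        "exams": 0.30,     # two exams (mid-term and final)
        "reading": 0.20,   # reading presentations
        "homework": 0.25,  # homework
        "project": 0.30,   # project
    }

    def course_total(scores):
        """Weighted sum of component scores, each on a 0-100 scale."""
        return sum(WEIGHTS[part] * scores[part] for part in WEIGHTS)

    def letter_grade(total):
        """Map a percentage to the letter scale in the table above."""
        if total >= 91:
            return "A"
        if total >= 81:
            return "B"
        if total >= 71:
            return "C"
        if total >= 61:
            return "D"
        return "F"  # failed

    scores = {"exams": 85, "reading": 90, "homework": 78, "project": 88}
    total = course_total(scores)
    print(f"{total:.1f}% -> {letter_grade(total)}")  # prints: 89.4% -> B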
Details on the course components

Two Exams. Mid-term exam (kontrolltöö): April 8. Final exam: June 3. Please note: the exams will only contain questions relevant to material covered in class and in the reading presentations. The final exam will not be cumulative and will only cover material discussed from mid-April onwards.

20% Reading presentations
10% - Reading write-up
5% + 5% - Reading presentation

You should read at least one relevant reading prior to each lecture. I would really like to see active participation during class discussions. Most of the readings are available online; if you have any difficulties accessing them, please let me know and I will help you.

You will be assigned one reading. I will try as much as possible to match the reading to your interests and project plan, but this is not always possible. On the day of your presentation, you must turn in a one-page summary describing the principal idea of the paper, its major contributions, and your thoughts on possible limitations of the work and/or how it could be extended; this write-up accounts for the 10% portion of your reading grade. For that same reading, you will have a 20-30 minute slot during a class period to present it. Note that I will not be the only one reviewing your reading presentation: your peers will also be active participants in judging your presentation skills, so try to deliver the material effectively. Length is not important. You should be concerned with the organization, your means of delivery, your overall knowledge of the material and your overall presentation; above all, your talk should induce discussion and interest. Remember, I am available during the Wednesday period for help and guidance with this. Finally, I will also request a copy of your presentation slides so that we can post them on the course website for easy reference.

25% Homework
Homework will be assigned as necessary and will contain questions about the reading material and a few problems to solve.

30% Project
15% - Write-up
5% (2.5% + 2.5%) - Two project-description deadlines
10% (5% + 5%) - Presentation

You will be required to complete an original project related to Machine Learning. The project is negotiable, but these are the two main possibilities:

- Practical work: The project may be related to your current research or other work, where you implement some novel machine learning method or use an existing machine learning method for your work. A write-up of 3-5 pages (no more, no less) will be due, in which you describe the problem, your implementation, your results and your conclusions. Note that your grade will be based on the quality of your research, not on the results obtained.

- Research topic: You pick a somewhat broad topic in machine learning; it could be one related to material we discussed in class, or something new. (For example, if you are interested in Bioinformatics, you might research clustering methods in Bioinformatics, or sequence algorithms in Bioinformatics, or both if you wish to be broader.) You will research that topic thoroughly, digging up all papers and book chapters that pertain to it, and you will prepare a write-up of 8-10 pages (no more, no less). Your write-up will include a description of the research on that topic, or its applications, and it should also include your own thoughts and ideas on the shortcomings of current research in your chosen topic, future goals, and other ideas you may have on extending or implementing current tools.

Project write-ups will be due on 27th May 2008. I will be available for guidance on the project, both to provide ideas and to help you with any implementations. To make sure you are on target, there are two project deadlines to keep in mind, which together contribute 5% towards your project grade. These are:
March 18th 2008 - Project Proposal (one-page description)
April 29th 2008 - Project Progress (one-page description)

The last week of May (May 27th to May 29th) will be dedicated to your project presentations. Just like your reading presentation, your peers will also help me in grading your project presentation performance.

Major Calendar Deadlines
Project Proposal - Mar 18 (2.5% of project grade)
*** EXAM 1 - April 8 ***
Project Progress report - Apr 29 (2.5% of project grade)
Final Project Paper - May 27
** Project Presentations - May 27, 28 **
*** EXAM 2 - June 3 ***

Suggested Textbooks (available on campus)
Kernel Methods for Pattern Analysis, Nello Cristianini and John Shawe-Taylor
Pattern Recognition and Machine Learning, Christopher M. Bishop
Information Theory, Inference and Learning Algorithms, David J. C. MacKay
Learning with Kernels, Bernhard Schölkopf and Alexander J. Smola

Course outline - First half (Feb to mid-April 2008)
Below are the topics we will be working on, with the relevant literature indicated. You are not expected to read them all! My intention is to expose you to multiple authors in the field and to give you the opportunity to choose one reading over another. Readings marked with * are highly recommended for the course. Readings marked with ** are to be presented by you in class.
Introduction to Machine Learning: basic terms and some applications (Feb 12)
- The Discipline of Machine Learning, Tom Mitchell
  http://www.cs.cmu.edu/~tom/pubs/MachineLearning.pdf
- Machine Learning, Nature Encyclopedia of Cognitive Science, Thomas G. Dietterich *
  http://web.engr.oregonstate.edu/~tgd/projects/tutorials.html
- Introduction to Machine Learning, Nils J. Nilsson
  http://robotics.stanford.edu/people/nilsson/MLDraftBook/ch1-ml.pdf
- Kernel Methods for Pattern Analysis, Chapter 1
- Machine Learning and Pattern Recognition (slides), Yann LeCun
  http://cs.nyu.edu/~yann/2007f-G22-2565-001/diglib/lecture03-regularization.pdf

Optimization Theory (Feb 13, 19)
- Learning with Kernels, Chapter 6
- Practical Optimization: A Gentle Introduction, Chapter 1 *
  John W. Chinneck, Systems and Computer Engineering, Carleton University, Ottawa, Ontario K1S 5B6, Canada
  http://www.sce.carleton.ca/faculty/chinneck/po/Chapter1.pdf
- Introduction to Optimization Methods: a Brief Survey of Methods **
  João S. D. Garcia, Sérgio L. Ávila, and Walter P. Carpes
  http://www.ewh.ieee.org/soc/e/sac/meem/vol01iss02/MEEM_opt.pdf
- The Interplay of Optimization and Machine Learning Research
  Kristin Bennett, Emilio Parrado-Hernandez
  http://jmlr.csail.mit.edu/papers/volume7/MLOPT-intro06a/MLOPT-intro06a.pdf

Kernel Methods (Feb 20, 26)
- An Introduction to Kernel-Based Learning Algorithms
  Klaus-Robert Müller, Sebastian Mika, Gunnar Rätsch, Koji Tsuda, Bernhard Schölkopf
- Kernel Methods for Pattern Analysis: Part I deals with the theory behind kernels; Part III is a plethora of kernel types
- Learning with Kernels, Chapter 2
- Fast String Kernels using Inexact Matching for Protein Sequences **
  Christina Leslie and Rui Kuang
  http://jmlr.csail.mit.edu/papers/volume5/leslie04a/leslie04a.pdf

Support Vector Machines (Feb 27)
- Support Vector Machines: Hype or Hallelujah?
  Kristin Bennett, Colin Campbell
  http://www.sigkdd.org/explorations/issue2-2/bennett.pdf
- Statistical learning and kernel methods
  Bernhard Schölkopf
  ftp://ftp.research.microsoft.com/pub/tr/tr-2000-23.pdf
- Support Vector Machines - an Introduction
  Ron Meir
  http://www.ee.technion.ac.il/~rmeir/SVMReview.pdf
- A Tutorial on Support Vector Machines for Pattern Recognition *
  Christopher J. C. Burges
  http://www.umiacs.umd.edu/~joseph/support-vector-machines4.pdf

*** No class on March 4th ***

Faster SVMs and SVM applications (Mar 5) [Konstantin]
- SVM-light: Making Large-Scale SVM Learning Practical
  Thorsten Joachims
  http://www.cs.cornell.edu/People/tj/publications/joachims_99a.pdf
- Training SVMs in linear time
  Thorsten Joachims
  http://www.cs.cornell.edu/People/tj/publications/joachims_06a.pdf
- SVM-Fold: a tool for discriminative multi-class protein fold and superfamily recognition **
  Iain Melvin, Eugene Ie, Rui Kuang, Jason Weston, William Stafford Noble and Christina Leslie
  http://www.biomedcentral.com/content/pdf/1471-2105-8-S4-S2.pdf

Support Vector Regression (Mar 11)
- A Tutorial on Support Vector Regression *
  Alex J. Smola, Bernhard Schölkopf
- Duality, Geometry and Support Vector Regression
  Jinbo Bi and Kristin Bennett
  http://www.cs.rpi.edu/~bij2/rec.html
- Statistical Analysis of Semi-Supervised Regression **
  John Lafferty, Larry Wasserman
  http://books.nips.cc/papers/files/nips20/NIPS2007_0293.pdf
- Compressed Regression
  Shuheng Zhou, John Lafferty, Larry Wasserman
  http://books.nips.cc/papers/files/nips20/NIPS2007_0195.pdf

Ranking (Mar 12)
- Optimizing Search Engines using Clickthrough Data **
  Thorsten Joachims
  http://www.cs.cornell.edu/people/tj/publications/joachims_02c.pdf
- Pranking with Ranking
  Koby Crammer, Yoram Singer (NIPS 2001)
- Learning to Order Things
  William W. Cohen, Robert E. Schapire, Yoram Singer
  http://people.csail.mit.edu/jrennie/papers/other/cohen-order-98.pdf
- The Netflix challenge, http://www.netflixprize.com/

New SVMs (Mar 18)
- New Support Vector Algorithms *
  B. Schölkopf, A. Smola, R. Williamson, P. Bartlett
  http://www.stat.purdue.edu/~yuzhu/stat598m3/Papers/NewSVM.pdf

******** Mar 18: Project Proposal due ********

Evaluative methods (Mar 19)
- On Comparing Classifiers: Pitfalls to Avoid and a Recommended Approach **
  S. Salzberg
- Crafting Papers in Machine Learning *
  Pat Langley
  http://www-csli.stanford.edu/icml2k/craft.html
- Regression Error Characteristic Curves
  Jinbo Bi and Kristin Bennett
  http://www.cs.rpi.edu/~bij2/doc/RECcurve.pdf
- Data Mining in Metric Space: An Empirical Analysis of Supervised Learning Performance Criteria
  Rich Caruana and Alexandru Niculescu-Mizil
  http://www.cs.cornell.edu/~caruana/perfs.kdd04.revised.rev1.pdf

Boosting (Mar 25)
- AdaBoost
  Jan Sochman, Jiri Matas
  http://cmp.felk.cvut.cz/~sochmj1/adaboost_talk.pdf
- A short introduction to boosting *
  Y. Freund, R. Schapire
- An efficient boosting algorithm for combining preferences **
  Y. Freund, R. Iyer, R. Schapire and Y. Singer

Principal Component Analysis and Data Visualization (Mar 26)
- Kernel Methods for Pattern Analysis, Chapter 6, Section 6.2
- A Tutorial on Principal Component Analysis *
  Lindsay I. Smith
  http://kybele.psych.cornell.edu/~edelman/Psych-465-Spring-2003/PCA-tutorial.pdf
- The Effect of Principal Component Analysis on Machine Learning Accuracy with High Dimensional Spectral Data **
  Tom Howley, Michael G. Madden, Marie-Louise O'Connell and Alan G. Ryder
  http://www.it.nuigalway.ie/m_madden/profile/pubs/kbs-2006b.pdf

Clustering (Apr 1)
- Towards a statistical theory of clustering
  Ulrike von Luxburg and Shai Ben-David
  http://www.cs.uwaterloo.ca/~shai/LuxburgBendavid05.pdf
- Support Vector Clustering
  Asa Ben-Hur, David Horn, Hava T. Siegelmann, Vladimir Vapnik
  http://jmlr.csail.mit.edu/papers/volume2/horn01a/rev1/horn01ar1.pdf
- A sober look at clustering stability **
  Shai Ben-David, Ulrike von Luxburg, and Dávid Pál
  http://www.cs.uwaterloo.ca/~shai/sober.pdf

Spectral clustering and evaluating clusterings + review for exam (Apr 2)
- Comparing Clusterings
  Marina Meila
- Learning spectral clustering
  Francis R. Bach, Michael I. Jordan
  http://cmm.ensmp.fr/~bach/nips03_cluster.pdf
- A Comparison of Spectral Clustering Algorithms *
  Deepak Verma and Marina Meila
- Functional Grouping of Genes Using Spectral Clustering and Gene Ontology **
  Nora Speer, Holger Fröhlich, Christian Spieth and Andreas Zell
  http://www.dkfz.de/mga2/gosim/GOFeatureMapsIJCNN05_submitted.pdf

******** Exam 1: April 8 ********
*** No class on April 9, 10 ***

Planned Calendar, Second half (April 15th to May 2008)

Manifold Learning (Apr 15)
- Algorithms for manifold learning *
  Lawrence Cayton
  http://vis.lbl.gov/~romano/mlgroup/papers/manifold-learning.pdf

K Nearest Neighbors and Novelty Detection (Apr 16)
- K-Nearest-Neighbor Consistency in Data Clustering: Incorporating Local Information into Global Optimization
  Chris Ding and Xiaofeng He
  http://delivery.acm.org/10.1145/970000/968021/p584ding.pdf?key1=968021&key2=1805861021&coll=GUIDE&dl=GUIDE&CFID=13294395&CFTOKEN=62602314
- Large Margin Nearest Neighbor Classifiers
  Carlotta Domeniconi, Dimitrios Gunopulos, and Jing Peng
  http://ieeexplore.ieee.org/iel5/72/31443/01461432.pdf
- A Linear Programming Approach to Novelty Detection **
  Colin Campbell and Kristin Bennett

Active Learning (Apr 22)
- Fast Kernel Classifiers with Online and Active Learning
  Antoine Bordes, Seyda Ertekin, Jason Weston, Leon Bottou
  http://jmlr.csail.mit.edu/papers/volume6/bordes05a/bordes05a.pdf
- Summary of current work on Active Learning
  Rong Jin
  http://www.cse.msu.edu/~rongjin/semisupervised/sum-activelearning.pdf
- Query Learning with Large Margin Classifiers
  Colin Campbell, Nello Cristianini, Alex J. Smola
  (URL still being investigated)
- Active Learning in the Drug Discovery Process **
  Manfred K. Warmuth, Gunnar Rätsch, Michael Mathieson, Jun Liao, Christian Lemmen
  http://www.soe.ucsc.edu/~manfred/pubs/C60.pdf

Probabilistic Methods in Machine Learning (April 23, 29)
- Information Theory, Inference and Learning Algorithms, David MacKay, Chapters 2 and 3
- Learning with Kernels, Chapter 6
- The Latent Process Decomposition of cDNA Microarray Data Sets **
  Simon Rogers, Mark Girolami, Colin Campbell, and Rainer Breitling
  http://delivery.acm.org/10.1145/1080000/1070680/n0143.pdf?key1=1070680&key2=9954842021&coll=GUIDE&dl=GUIDE&CFID=54016052&CFTOKEN=43348973

******** April 29: Project Progress report due ********

Kernel Fisher (May 6)
- Learning with Kernels, Chapter 15
- Asymptotic properties of the Fisher kernel
  Koji Tsuda, Shotaro Akaho, Motoaki Kawanabe and Klaus-Robert Müller
  http://www2.informatik.hu-berlin.de/Forschung_Lehre/wm/journalclub/pdf2268.pdf

Data fusion (May 7)
- Kernel-based data fusion and its application to protein function prediction in yeast **
  G.R.G. Lanckriet, M. Deng, N. Cristianini, M.I. Jordan, W.S. Noble
  http://noble.gs.washington.edu/papers/lanckriet_kernel.pdf
- Kernel-based data fusion for gene prioritization **
  Tijl De Bie, Leon-Charles Tranchevent, Liesbeth M. M. van Oeffelen and Yves Moreau
  http://bioinformatics.oxfordjournals.org/cgi/content/full/23/13/i125
- Data Integration for Classification Problems Employing Gaussian Process Priors **
  Mark Girolami and Mingjun Zhong
  http://books.nips.cc/papers/files/nips19/NIPS2006_0206.pdf

Multi-Task Learning (May 13)
- Learning Multiple Tasks with Kernel Methods
  Theodoros Evgeniou, Charles Micchelli, Massimiliano Pontil
  http://www.cs.berkeley.edu/~russell/classes/cs294/f05/papers/evgeniou+al-2005.pdf
- Regularized Multi-Task Learning
  Theodoros Evgeniou and Massimiliano Pontil
  http://www.cs.ucl.ac.uk/staff/M.Pontil/reading/mt-kdd.pdf

Biological applications (May 14)
- Feature Selection Methods for Improving Protein Structure Prediction with Rosetta **
  Ben Blum, Michael Jordan, David Kim, Rhiju Das, Philip Bradley, David Baker
  http://books.nips.cc/papers/files/nips20/NIPS2007_1055.pdf
- Typing Staphylococcus aureus Using the spa Gene and Novel Distance Measures
  P. Agius, B. Kreiswirth, S. Naidich, K. Bennett
- Protein network inference from multiple genomic data: a supervised approach **
  Y. Yamanishi, J.-P. Vert, M. Kanehisa
  http://bioinformatics.oxfordjournals.org/cgi/reprint/20/suppl_1/i363

Other applications (May 20)
- Finding Language-Independent Semantic Representation of Text Using Kernel Canonical Correlation Analysis **
  Alexei Vinokourov, John Shawe-Taylor, Nello Cristianini
- Video Deconstruction: Recovering Scene Structures in Movies **
  Timothee Cour, Ben Taskar (paper currently under review, link to be provided later)

Concluding remarks, brainstorming on the future of ML, review for exam (May 21)

*** Final Project Paper due - May 27 ***
*** Project Presentations - May 27, 28 ***
******** Exam 2: June 3 ********