Class Times: Monday and Wednesday 10:30-12:00
Units: 3-0-9 H,G
Location: 46-5193
Instructors: Tomaso Poggio (TP), Ryan Rifkin (RR), Jake Bouvrie (JB), Lorenzo Rosasco (LR)
Office Hours: By appointment
Email Contact: 9.520@mit.edu
Previous Class: SPRING 07
Course description
Focuses on the problem of supervised and unsupervised learning from the perspective of modern statistical learning theory, starting with the theory of multivariate function approximation from sparse data. Develops basic tools such as regularization, including support vector machines for regression and classification. Derives generalization bounds using stability. Discusses current research topics such as manifold regularization, sparsity, feature selection, Bayesian connections and techniques, and online learning. Emphasizes applications in several areas: computer vision, speech recognition, and bioinformatics. Discusses advances in the neuroscience of the cortex and their impact on learning theory and applications. The course is graded on the basis of final projects and hands-on applications and exercises.
Prerequisites
6.867 or permission of instructor. In practice, a substantial level of mathematical maturity is necessary. Familiarity with probability and functional analysis will be very helpful. We try to keep the mathematical prerequisites to a minimum, but we will introduce complicated material at a fast pace.
Grading
There will be two problem sets, a Matlab assignment, and a final project. To receive credit, you must attend regularly and put in effort on all problem sets and the project.
Problem sets
Problem set #1: PDF -- Due Wed. March 12
Problem set #2: PDF -- Due Monday April 14 (in class)
Projects
Project ideas: PDF
Syllabus
Follow the link for each class to find a detailed description, suggested readings, and class slides. Some of the later classes may be subject to reordering or rescheduling.
Class | Date | Title | Instructor(s)
Class 01 | Wed 06 Feb | The Course at a Glance | TP
Class 02 | Mon 11 Feb | The Learning Problem and Regularization | TP
Class 03 | Wed 13 Feb | Reproducing Kernel Hilbert Spaces | LR
Mon 18 Feb - President's Day
Class 04 | Tue 19 Feb | Regularized Least Squares | RR
Class 05 | Wed 20 Feb | Several Views Of Support Vector Machines | RR
Class 06 | Mon 25 Feb | Multiclass Classification | RR
Class 07 | Wed 27 Feb | Spectral Regularization | LR
Class 08 | Mon 03 Mar | Iterative Optimization Techniques | Ross Lippert
Class 09 | Wed 05 Mar | Online Learning | Sasha Rakhlin
Class 10 | Mon 10 Mar | Generalization Bounds, Intro to Stability | Sasha Rakhlin
Class 11 | Wed 12 Mar | Stability of Tikhonov Regularization | Sasha Rakhlin
Class 12 | Mon 17 Mar | Sparsity Based Regularization | LR
Class 13 | Wed 19 Mar | Loose ends, Project discussions |
SPRING BREAK
Class 14 | Mon 31 Mar | Manifold Regularization | LR
Class 15 | Wed 02 Apr | Bayesian Methods | Sayan Mukherjee
Class 16 | Mon 07 Apr | Topics in Approximation Theory | Ben Recht
Class 17 | Wed 09 Apr | Vision and Visual Neuroscience | TP
Class 18 | Mon 14 Apr | Vision and Visual Neuroscience | Thomas Serre
Class 19 | Wed 16 Apr | Deep Belief Networks | Geoff Hinton
Mon 21 Apr - Patriot's Day
Class 20 | Wed 23 Apr | Derived Distance | JB
Class 21 | Mon 28 Apr | Hierarchical Regression | Federico Girosi
Class 22 | Wed 30 Apr | Morphable Models for Video | Tony Ezzat
Class 23 | Mon 05 May | Manifold Learning I | Partha Niyogi
Class 24 | Wed 07 May | Manifold Learning II | Partha Niyogi
Class 25 | Mon 12 May | Project Presentations |
Class 26 | Wed 14 May | Project Presentations |
Math Camp 1 | Mon 11 Feb, 5:00pm-7:00pm | Functional analysis
Math Camp 2 | XX | Probability theory
Reading List
There is no textbook for this course. All the required information will be presented in the slides associated with each class. The books/papers listed below are useful general reference reading, especially from the theoretical viewpoint. A list of suggested readings will also be provided separately for each class.
Primary References
- Bousquet, O., S. Boucheron and G. Lugosi. Introduction to Statistical Learning Theory. Advanced Lectures on Machine Learning, Lecture Notes in Artificial Intelligence 3176, 169-207. (Eds.) Bousquet, O., U. von Luxburg and G. Ratsch, Springer, Heidelberg, Germany (2004).
- F. Cucker and S. Smale. On The Mathematical Foundations of Learning. Bulletin of the American Mathematical Society, 2002.
- L. Devroye, L. Gyorfi, and G. Lugosi. A Probabilistic Theory of Pattern Recognition. Springer, 1997.
- T. Evgeniou, M. Pontil, and T. Poggio. Regularization Networks and Support Vector Machines. Advances in Computational Mathematics, 2000.
- T. Poggio and S. Smale. The Mathematics of Learning: Dealing with Data. Notices of the AMS, 2003
- V. N. Vapnik. Statistical Learning Theory. Wiley, 1998.
- V. N. Vapnik. The Nature of Statistical Learning Theory. Springer, 1995.
Secondary References
- O. Bousquet and A. Elisseeff, Stability and Generalization, Journal of Machine Learning Research, Vol. 2, pp.499-526, 2002.
- N. Cristianini and J. Shawe-Taylor. Introduction To Support Vector Machines. Cambridge, 2000.
- Lo Gerfo, L., L. Rosasco, F. Odone, E. De Vito and A. Verri. Spectral Algorithms for Supervised Learning, to appear in Neural Computation.
- Poggio, T., R. Rifkin, S. Mukherjee and P. Niyogi. General Conditions for Predictivity in Learning Theory, Nature, Vol. 428, 419-422, 2004 (see also Past Performance and Future Results).
- Rifkin, R. and R.A. Lippert. Notes on Regularized Least-Squares, CBCL Paper #268/AI Technical Report #2007-019, Massachusetts Institute of Technology, Cambridge, MA, May, 2007.
- Rifkin, R. and A. Klautau. In Defense of One-vs-All Classification, Journal of Machine Learning Research, Vol. 5, 101-141, 2004.
Background Mathematics References
- A. N. Kolmogorov and S. V. Fomin, Introductory Real Analysis, Dover Publications, 1975.
- A. N. Kolmogorov and S. V. Fomin, Elements of the Theory of Functions and Functional Analysis, Dover Publications, 1999.
- Luenberger, Optimization by Vector Space Methods, Wiley, 1969.