9.520: Statistical Learning Theory and Applications, Spring 2008

Class Times: Monday and Wednesday 10:30-12:00

Units: 3-0-9 H,G

Location: 46-5193

Instructors: Tomaso Poggio (TP), Ryan Rifkin (RR), Jake Bouvrie (JB), Lorenzo Rosasco (LR)

Office Hours: By appointment

Email Contact: 9.520@mit.edu

Previous Class: SPRING 07

Course description
Focuses on the problem of supervised and unsupervised learning from the perspective of modern statistical learning theory, starting with the theory of multivariate function approximation from sparse data. Develops basic tools such as regularization, including support vector machines for regression and classification. Derives generalization bounds using stability. Discusses current research topics such as manifold regularization, sparsity, feature selection, Bayesian connections and techniques, and online learning. Emphasizes applications in several areas: computer vision, speech recognition, and bioinformatics. Discusses advances in the neuroscience of the cortex and their impact on learning theory and applications. The course is graded on the basis of final projects and hands-on applications and exercises.
Prerequisites
6.867 or permission of instructor. In practice, a substantial level of mathematical maturity is necessary. Familiarity with probability and functional analysis will be very helpful. We try to keep the mathematical prerequisites to a minimum, but we will introduce complicated material at a fast pace.
Grading
There will be two problem sets, a Matlab assignment, and a final project. To receive credit, you must attend regularly and put in effort on all problem sets and the project.

Problem sets
Problem set #1: PDF -- Due Wed. March 12
Problem set #2: PDF -- Due Monday April 14 (in class)

Projects
Project ideas: PDF

Syllabus
Follow the link for each class to find a detailed description, suggested readings, and class slides. Some of the later classes may be subject to reordering or rescheduling.

Date Title Instructor(s)

Class 01 Wed 06 Feb The Course at a Glance TP

Class 02 Mon 11 Feb The Learning Problem and Regularization TP

Class 03 Wed 13 Feb Reproducing Kernel Hilbert Spaces LR

Mon 18 Feb - Presidents' Day

Class 04 Tue 19 Feb Regularized Least Squares RR

Class 05 Wed 20 Feb Several Views Of Support Vector Machines RR

Class 06 Mon 25 Feb Multiclass Classification RR

Class 07 Wed 27 Feb Spectral Regularization LR

Class 08 Mon 03 Mar Iterative Optimization Techniques Ross Lippert

Class 09 Wed 05 Mar Online Learning Sasha Rakhlin

Class 10 Mon 10 Mar Generalization Bounds, Intro to Stability Sasha Rakhlin

Class 11 Wed 12 Mar Stability of Tikhonov Regularization Sasha Rakhlin

Class 12 Mon 17 Mar Sparsity Based Regularization LR

Class 13 Wed 19 Mar Loose ends, Project discussions

SPRING BREAK

Class 14 Mon 31 Mar Manifold Regularization LR

Class 15 Wed 02 Apr Bayesian Methods Sayan Mukherjee

Class 16 Mon 07 Apr Topics in Approximation Theory Ben Recht

Class 17 Wed 09 Apr Vision and Visual Neuroscience TP

Class 18 Mon 14 Apr Vision and Visual Neuroscience Thomas Serre

Class 19 Wed 16 Apr Deep Belief Networks Geoff Hinton

Mon 21 Apr - Patriots' Day

Class 20 Wed 23 Apr Derived Distance JB

Class 21 Mon 28 Apr Hierarchical Regression Federico Girosi

Class 22 Wed 30 Apr Morphable Models for Video Tony Ezzat

Class 23 Mon 05 May Manifold Learning I Partha Niyogi

Class 24 Wed 07 May Manifold Learning II Partha Niyogi

Class 25 Mon 12 May Project Presentations

Class 26 Wed 14 May Project Presentations

Math Camp 1 Mon 11 Feb 5:00pm-7:00pm Functional analysis

Math Camp 2 XX Probability theory

Reading List
There is no textbook for this course. All the required information will be presented in the slides associated with each class. The books/papers listed below are useful general reference reading, especially from the theoretical viewpoint. A list of suggested readings will also be provided separately for each class.
Primary References

Bousquet, O., S. Boucheron and G. Lugosi. Introduction to Statistical Learning Theory. Advanced Lectures on Machine Learning, Lecture Notes in Artificial Intelligence 3176, 169-207. (Eds.) Bousquet, O., U. von Luxburg and G. Ratsch, Springer, Heidelberg, Germany (2004).

F. Cucker and S. Smale. On The Mathematical Foundations of Learning. Bulletin of the American Mathematical Society, 2002.

L. Devroye, L. Gyorfi, and G. Lugosi. A Probabilistic Theory of Pattern Recognition. Springer, 1997.

T. Evgeniou and M. Pontil and T. Poggio. Regularization Networks and Support Vector Machines. Advances in Computational Mathematics, 2000.

T. Poggio and S. Smale. The Mathematics of Learning: Dealing with Data. Notices of the AMS, 2003.

V. N. Vapnik. Statistical Learning Theory. Wiley, 1998.

V. N. Vapnik. The Nature of Statistical Learning Theory. Springer, 1995.

Secondary References

O. Bousquet and A. Elisseeff, Stability and Generalization, Journal of Machine Learning Research, Vol. 2, pp.499-526, 2002.

N. Cristianini and J. Shawe-Taylor. Introduction To Support Vector Machines. Cambridge, 2000.

Lo Gerfo, L., Rosasco, L., Odone, F., De Vito, E. and Verri, A. Spectral Algorithms for Supervised Learning, to appear in Neural Computation.

Poggio, T., R. Rifkin, S. Mukherjee and P. Niyogi. General Conditions for Predictivity in Learning Theory, Nature, Vol. 428, 419-422, 2004 (see also Past Performance and Future Results).

Rifkin, R. and R.A. Lippert. Notes on Regularized Least-Squares, CBCL Paper #268/AI Technical Report #2007-019, Massachusetts Institute of Technology, Cambridge, MA, May, 2007.

Rifkin, R. and A. Klautau. In Defense of One-vs-All Classification, Journal of Machine Learning Research, Vol. 5, 101-141, 2004.

Background Mathematics References

A. N. Kolmogorov and S. V. Fomin, Introductory Real Analysis, Dover Publications, 1975.

A. N. Kolmogorov and S. V. Fomin, Elements of the Theory of Functions and Functional Analysis, Dover Publications, 1999.

Luenberger, Optimization by Vector Space Methods, Wiley, 1969.
