Prerequisites
Permission of the instructor is required. The following courses are helpful but not required: Theory of Probability (18.175) and either Statistical Learning Theory and Applications (9.520) or Machine Learning (6.867).
Description
The main goal of this course is to study the generalization ability of a number of popular machine learning algorithms, such as boosting, support vector machines, and neural networks. We will develop technical tools that allow us to give qualitative explanations of why these learning algorithms work so well in many classification problems.
Topics of the course include Vapnik-Chervonenkis theory, concentration inequalities in product spaces, and other elements of empirical process theory.
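As a point of reference for the outline below, the classification set-up the course studies can be sketched as follows; the notation here is illustrative and not necessarily the notation used in lecture.

```latex
% Illustrative binary classification set-up (notation is an assumption, not the course's own):
% (X_1, Y_1), \dots, (X_n, Y_n) are i.i.d. copies of (X, Y), with labels Y \in \{-1, +1\}.
% For a classifier f, the generalization (true) error and the empirical error are
\[
  R(f) = \mathbb{P}\bigl( f(X) \neq Y \bigr),
  \qquad
  R_n(f) = \frac{1}{n} \sum_{i=1}^{n} \mathbf{1}\bigl\{ f(X_i) \neq Y_i \bigr\}.
\]
% "Generalization ability" refers to controlling the gap R(\hat{f}_n) - R_n(\hat{f}_n)
% for the classifier \hat{f}_n produced by a learning algorithm, typically uniformly
% over the class of functions the algorithm can output.
```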
Grading
The grade is based upon two problem sets and class attendance.
Course Outline
Introduction
- Classification Problem Set-up
- Examples of Learning Algorithms: Voting Algorithms (Boosting), Support Vector Machines, Neural Networks
- Analyzing Generalization Ability
Technical Tools: Elements of Empirical Process Theory
One-dimensional Concentration Inequalities
- Chebyshev (Markov), Rademacher, Hoeffding, Bernstein, Bennett (see the example following this list)
- Toward Uniform Bounds: Union Bound, Clustering
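As a representative example of the one-dimensional bounds listed above, Hoeffding's inequality can be stated in its standard form; the exact formulation used in lecture may differ.

```latex
% Hoeffding's inequality: X_1, \dots, X_n i.i.d. with a \le X_i \le b almost surely.
\[
  \mathbb{P}\Bigl( \Bigl| \frac{1}{n} \sum_{i=1}^{n} X_i - \mathbb{E} X_1 \Bigr| \ge t \Bigr)
  \le 2 \exp\Bigl( - \frac{2 n t^2}{(b - a)^2} \Bigr),
  \qquad t > 0.
\]
```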
Vapnik-Chervonenkis Theory and More
- VC Classes of Sets and Functions
- Shattering Numbers, Growth Function, Covering Numbers (see the example following this list)
- Examples of VC Classes, Properties
- Uniform Deviation Bounds
- Symmetrization
- Kolmogorov's Chaining Technique
- Dudley's Entropy Integral
- Contraction Principles
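For orientation on the shattering-number topics above, the growth function of a class of sets and its bound via the VC dimension (the Sauer-Shelah lemma) read as follows; this is a standard textbook formulation, not necessarily the exact one used in lecture.

```latex
% Growth (shattering) function of a class of sets \mathcal{C}:
\[
  \Pi_{\mathcal{C}}(n) = \max_{x_1, \dots, x_n}
  \bigl| \{ C \cap \{x_1, \dots, x_n\} : C \in \mathcal{C} \} \bigr|.
\]
% If the VC dimension of \mathcal{C} is d (the largest n with \Pi_{\mathcal{C}}(n) = 2^n),
% the Sauer-Shelah lemma gives, for n \ge d,
\[
  \Pi_{\mathcal{C}}(n) \le \sum_{i=0}^{d} \binom{n}{i} \le \Bigl( \frac{e n}{d} \Bigr)^{d}.
\]
```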
Concentration Inequalities
- Talagrand's Concentration Inequality on the Cube
- Symmetrization
- Talagrand's Concentration Inequality for Empirical Processes
- Vapnik-Chervonenkis Type Inequalities
- Martingale-difference Inequalities (see the example following this list)
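As an example of the martingale-difference inequalities named above, McDiarmid's bounded differences inequality (usually proved via the Azuma-Hoeffding martingale argument) can be stated as below; again, this is a standard formulation rather than the course's own.

```latex
% McDiarmid's bounded differences inequality: X_1, \dots, X_n independent, and f satisfies
% |f(x_1, \dots, x_i, \dots, x_n) - f(x_1, \dots, x_i', \dots, x_n)| \le c_i
% whenever two argument vectors differ only in the i-th coordinate.
\[
  \mathbb{P}\bigl( f(X_1, \dots, X_n) - \mathbb{E} f(X_1, \dots, X_n) \ge t \bigr)
  \le \exp\Bigl( - \frac{2 t^2}{\sum_{i=1}^{n} c_i^2} \Bigr),
  \qquad t > 0.
\]
```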
Applications