Syllabus

Prerequisites

Permission of the instructor is required. The following courses are helpful but not required: Theory of Probability (18.175) and either Statistical Learning Theory and Applications (9.520) or Machine Learning (6.867).

Description

The main goal of this course is to study the generalization ability of a number of popular machine learning algorithms, such as boosting, support vector machines, and neural networks. We will develop technical tools that allow us to give qualitative explanations of why these learning algorithms work so well in many classification problems.

Topics of the course include Vapnik-Chervonenkis theory, concentration inequalities in product spaces, and other elements of empirical process theory.

Grading

The grade is based upon two problem sets and class attendance.

Course Outline

Introduction

  • Classification Problem Set-up
  • Examples of Learning Algorithms: Voting Algorithms (Boosting), Support Vector Machines, Neural Networks
  • Analyzing Generalization Ability
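
As a point of reference for the first item above, the classification problem is typically set up as follows (standard notation; details may differ from the lecture notes):

    (X_1, Y_1), \ldots, (X_n, Y_n) \ \text{i.i.d.} \sim P \ \text{on} \ \mathcal{X} \times \{-1, +1\}, \qquad f : \mathcal{X} \to \{-1, +1\},

    L(f) = P\big( f(X) \neq Y \big), \qquad \hat{L}_n(f) = \frac{1}{n} \sum_{i=1}^{n} I\big( f(X_i) \neq Y_i \big).

Analyzing generalization ability amounts to bounding the gap between L(\hat{f}) and \hat{L}_n(\hat{f}) when \hat{f} is selected from a class \mathcal{F} using the data, which in turn requires controlling the deviation L(f) - \hat{L}_n(f) uniformly over \mathcal{F}; the tools below are developed for exactly this purpose.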

Technical Tools: Elements of Empirical Process Theory

One-dimensional Concentration Inequalities

  • Chebyshev (Markov), Rademacher, Hoeffding, Bernstein, Bennett
  • Toward Uniform Bounds: Union Bound, Clustering
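
As an example of the inequalities listed above, Hoeffding's inequality for i.i.d. random variables X_1, \ldots, X_n taking values in [a, b] states that for every t > 0,

    P\left( \frac{1}{n} \sum_{i=1}^{n} X_i - \mathbb{E} X_1 \ge t \right) \le \exp\left( - \frac{2 n t^2}{(b - a)^2} \right).

Applied to the 0-1 loss of a single fixed classifier (a = 0, b = 1), this gives, with probability at least 1 - \delta,

    L(f) \le \hat{L}_n(f) + \sqrt{ \frac{\log(1/\delta)}{2n} },

and the union bound extends such a statement to finitely many classifiers, which motivates the uniform bounds of the next section.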

Vapnik-Chervonenkis Theory and More

  • VC Classes of Sets and Functions
  • Shattering Numbers, Growth Function, Covering Numbers
  • Examples of VC Classes, Properties
  • Uniform Deviation Bounds
  • Symmetrization
  • Kolmogorov's Chaining Technique
  • Dudley's Entropy Integral
  • Contraction Principles
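
Two of the quantitative statements behind these items, in their standard form (constants and normalizations may differ from those used in lecture): the Sauer-Shelah lemma bounds the growth function of a class \mathcal{C} of VC dimension d by

    \Pi_{\mathcal{C}}(n) \le \sum_{k=0}^{d} \binom{n}{k} \le \left( \frac{e n}{d} \right)^{d} \quad \text{for } n \ge d,

and Dudley's entropy integral bounds the Rademacher average of a function class \mathcal{F} through its empirical covering numbers,

    \mathbb{E}_{\varepsilon} \sup_{f \in \mathcal{F}} \frac{1}{n} \sum_{i=1}^{n} \varepsilon_i f(X_i) \le \frac{C}{\sqrt{n}} \int_{0}^{D} \sqrt{ \log N\big( \mathcal{F}, \varepsilon, L_2(P_n) \big) } \, d\varepsilon,

where the \varepsilon_i are i.i.d. Rademacher signs, D is the L_2(P_n) diameter of \mathcal{F}, and C is a universal constant.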

Concentration Inequalities

  • Talagrand's Concentration Inequality on the Cube
  • Symmetrization
  • Talagrand's Concentration Inequality for Empirical Processes
  • Vapnik-Chervonenkis Type Inequalities
  • Martingale-difference Inequalities
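
A representative statement from this part of the course is the bounded differences inequality (a consequence of the martingale-difference method, usually attributed to McDiarmid): if Z = g(X_1, \ldots, X_n) for independent X_i and changing any single coordinate changes the value of g by at most c_i, then for every t > 0,

    P\big( Z - \mathbb{E} Z \ge t \big) \le \exp\left( - \frac{2 t^2}{\sum_{i=1}^{n} c_i^2} \right).

Taking Z to be the supremum of L(f) - \hat{L}_n(f) over a class of classifiers with loss bounded by 1 gives c_i \le 1/n, so this supremum concentrates around its expectation at rate 1/\sqrt{n}; Talagrand's inequality for empirical processes sharpens this by taking the variance of the process into account.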

Applications

  • Generalization Ability of Voting Classifiers, Neural Networks, Support Vector Machines
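
To give the flavor of these results: for voting classifiers, the margin bound of Schapire, Freund, Bartlett, and Lee states, informally, that with probability at least 1 - \delta every convex combination f of base classifiers from a class of VC dimension d satisfies, for every margin \theta > 0,

    P\big( Y f(X) \le 0 \big) \le \frac{1}{n} \sum_{i=1}^{n} I\big( Y_i f(X_i) \le \theta \big) + O\left( \sqrt{ \frac{d \log^2(n/d)}{n \theta^2} + \frac{\log(1/\delta)}{n} } \right),

so a classifier that achieves large margins on the training data generalizes well even when the class of combined classifiers is very rich. The analogous bounds for neural networks and support vector machines replace the VC-dimension term with norm-based or kernel-based complexity measures.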