Diversity-inducing probability measures for machine learning

Li, Chengtao,Ph. D.Massachusetts Institute of Technology.

Author(s)

Li, Chengtao,Ph. D.Massachusetts Institute of Technology.

Download1102048939-MIT.pdf (14.74Mb)

Other Contributors

Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science.

Advisor

Suvrit Sra and Stefanie Jegelka.

Terms of use

MIT theses are protected by copyright. They may be viewed, downloaded, or printed from this source but further reproduction or distribution in any format is prohibited without written permission. http://dspace.mit.edu/handle/1721.1/7582

Metadata

Show full item record

Abstract

Subset selection problems arise in machine learning within kernel approximation, experimental design, and numerous other applications. In such applications, one often seeks to select diverse subsets of items to represent the population. One way to select such diverse subsets is to sample according to Diversity-Inducing Probability Measures (DIPMs) that assign higher probabilities to more diverse subsets. DIPMs underlie several recent breakthroughs in mathematics and theoretical computer science, but their power has not yet been explored for machine learning. In this thesis, we investigate DIPMs, their mathematical properties, sampling algorithms, and applications. Perhaps the best known instance of a DIPM is a Determinantal Point Process (DPP). DPPs originally arose in quantum physics, and are known to have deep relations to linear algebra, combinatorics, and geometry. We explore applications of DPPs to kernel matrix approximation and kernel ridge regression.

In these applications, DPPs deliver strong approximation guarantees and obtain superior performance compared to existing methods. We further develop an MCMC sampling algorithm accelerated by Gauss-type quadratures for DPPs. The algorithm runs several orders of magnitude faster than the existing ones. DPPs lie in a larger class of DIPMs called Strongly Rayleigh (SR) Measures. Instances of SR measures display a strong negative dependence property known as negative association, and as such can be used to model subset diversity. We study mathematical properties of SR measures, and construct the first provably fast-mixing Markov chain that samples from general SR measures. As a special case, we consider an SR measure called Dual Volume Sampling (DVS), for which we present the first poly-time sampling algorithm.

While all considered distributions over subsets are unconstrained, those of interest in the real world usually come with constraints due to prior knowledge, resource limitations or personal preferences. Hence we investigate sampling from constrained versions of DIPMs. Specifically, we consider DIPMs with cardinality constraints and matroid base constraints and construct poly-time approximate sampling algorithms for them. Such sampling algorithms will enable practical uses of constrained DIPMs in real world.

Description

Thesis: Ph. D., Massachusetts Institute of Technology, Department of Electrical Engineering and Computer Science, 2019

Cataloged from PDF version of thesis.

Includes bibliographical references (pages 163-176).

Date issued

2019

URI

https://hdl.handle.net/1721.1/121724

Department

Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science

Publisher

Massachusetts Institute of Technology

Keywords

Electrical Engineering and Computer Science.

Collections

Doctoral Theses