Efficient sampling for determinantal point processes
Author(s)
Unknown author
Other Contributors
Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science.
Advisor
Stefanie Jegelka and Suvrit Sra.
Abstract
Determinantal Point Processes (DPPs) are elegant probabilistic models of repulsion and diversity over discrete sets of items. A DPP assigns higher probability to diverse subsets, making them more likely to be sampled. When the size of the sampled subsets must be controlled exactly, the natural choice is the k-DPP, a practical specialization of the DPP that samples only subsets of size k. In this thesis, we address efficient sampling algorithms for, and applications of, (k-)DPPs. First, we propose a new method for approximate sampling from k-DPPs. Our method exploits the diversity of subsets sampled from a DPP and proceeds in two stages: it first constructs a small subset of the data, called a coreset, whose induced distribution approximates the k-DPP distribution, and then samples efficiently from this coreset-approximated distribution. This approximate sampling strategy matches the original distribution more closely than existing methods and is more efficient when multiple samples from the k-DPP are required. Second, we consider accelerating existing Markov chain (k-)DPP samplers when the kernel matrix is sparse. Concretely, we present a general framework for accelerating algorithms that compute bilinear forms of the type u^T A^{-1} u as a computational subroutine. In our framework, we bound u^T A^{-1} u efficiently with Gauss-type quadrature. We study theoretical properties of Gauss-type quadrature and illustrate the empirical consequences of our results by accelerating (k-)DPP sampling, where we observe tremendous speedups. Finally, we show how DPPs can be applied to core machine learning tasks. Because of their diversity-promoting property, DPPs are potentially useful in many applications where good sketching of the data is needed. In our case, we apply DPPs to the Nyström method and to kernel ridge regression. We give theoretical guarantees for using DPPs in these methods and observe superior performance in practice.
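To make the quadrature idea in the abstract concrete, the sketch below shows one standard way (the classical Golub–Meurant connection between the Lanczos process and Gauss quadrature) to obtain a lower bound on u^T A^{-1} u for a symmetric positive definite matrix A; it is an illustrative assumption of how such a bound can be computed, not the specific algorithm developed in the thesis, and the function name and NumPy-based setup are invented for this example.

```python
import numpy as np

def gauss_quadrature_lower_bound(A, u, num_iters=10):
    """Lower bound on u^T A^{-1} u for symmetric positive definite A.

    Runs a few Lanczos iterations started from u and evaluates the
    Gauss-quadrature estimate ||u||^2 * e_1^T T_k^{-1} e_1, where T_k is
    the Lanczos tridiagonal matrix. For f(x) = 1/x on a positive
    spectrum the Gauss rule underestimates the integral, so this is a
    lower bound that tightens as num_iters grows.
    """
    u = np.asarray(u, dtype=float)
    norm_u = np.linalg.norm(u)
    q_prev = np.zeros_like(u)
    q = u / norm_u
    alphas, betas = [], []
    beta = 0.0
    for _ in range(num_iters):
        w = A @ q - beta * q_prev          # Lanczos three-term recurrence
        alpha = float(q @ w)
        w -= alpha * q
        beta = float(np.linalg.norm(w))
        alphas.append(alpha)
        if beta < 1e-12:                   # invariant subspace: estimate is exact
            break
        betas.append(beta)
        q_prev, q = q, w / beta
    k = len(alphas)
    T = np.diag(alphas) + np.diag(betas[:k - 1], 1) + np.diag(betas[:k - 1], -1)
    e1 = np.zeros(k)
    e1[0] = 1.0
    return norm_u ** 2 * float(e1 @ np.linalg.solve(T, e1))

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    M = rng.standard_normal((200, 200))
    A = M @ M.T + 200 * np.eye(200)        # a well-conditioned SPD test matrix
    u = rng.standard_normal(200)
    exact = u @ np.linalg.solve(A, u)
    approx = gauss_quadrature_lower_bound(A, u, num_iters=8)
    print(exact, approx)                   # approx <= exact, and close
```

Bounds of this kind are useful inside Markov chain (k-)DPP samplers because the acceptance ratios reduce to determinant ratios, which in turn reduce to quadratic forms involving an inverse of a principal submatrix; a few matrix-vector products with a sparse kernel matrix then suffice instead of a full solve.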
Description
Thesis: S.M., Massachusetts Institute of Technology, Department of Electrical Engineering and Computer Science, 2016. Cataloged from PDF version of thesis. Includes bibliographical references (pages 93-105).
Date issued
2016
Department
Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science
Publisher
Massachusetts Institute of Technology
Keywords
Electrical Engineering and Computer Science.