
dc.contributor.advisor	Stefanie Jegelka and Suvrit Sra.	en_US
dc.contributor.author	Li, Chengtao
dc.contributor.other	Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science.	en_US
dc.date.accessioned	2016-12-22T16:28:41Z
dc.date.available	2016-12-22T16:28:41Z
dc.date.copyright	2016	en_US
dc.date.issued	2016	en_US
dc.identifier.uri	http://hdl.handle.net/1721.1/106092
dc.description	Thesis: S.M., Massachusetts Institute of Technology, Department of Electrical Engineering and Computer Science, 2016.	en_US
dc.description	Cataloged from PDF version of thesis.	en_US
dc.description	Includes bibliographical references (pages 93-105).	en_US
dc.description.abstract	Determinantal Point Processes (DPPs) are elegant probabilistic models of repulsion and diversity over discrete sets of items. A DPP assigns higher probability to diverse subsets, making them more likely to be sampled. When the size of the sampled subsets must be controlled exactly, the natural choice is the k-DPP, a practical specialization of the DPP that samples only size-k subsets. In this thesis, we address efficient sampling algorithms for, and applications of, (k-)DPPs. First, we propose a new method to sample approximately from k-DPPs. Our method exploits the diversity of subsets sampled from a DPP and proceeds in two stages: it first constructs a small subset of the data, a coreset, whose induced distribution approximates the k-DPP distribution, and then samples efficiently from this coreset-approximated distribution. This approximate sampling strategy matches the original distribution more closely than existing methods and is more efficient when multiple samples from the k-DPP are required. Second, we consider accelerating existing Markov chain (k-)DPP samplers when the data kernel matrix is sparse. Concretely, we present a general framework for accelerating algorithms that require computing quantities of the form u^T A^{-1} u as a subroutine. In our framework, we bound u^T A^{-1} u efficiently with Gauss-type quadrature. We study theoretical properties of Gauss-type quadrature and illustrate the empirical consequences of our results by accelerating (k-)DPP sampling, where we observe substantial speedups. Finally, we show how DPPs can be applied to core machine learning tasks. Owing to their diversity-promoting (repulsive) nature, DPPs are potentially useful wherever good sketches of the data are needed. Here, we apply DPPs to the Nyström method and to kernel ridge regression. We give theoretical guarantees for using DPPs in these methods and observe superior performance in practice.	en_US
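Illustrative note: the abstract refers to Markov chain (k-)DPP samplers whose inner loop repeatedly evaluates quantities such as u^T A^{-1} u. Below is a minimal sketch of a standard swap-chain sampler in NumPy, assuming a precomputed PSD kernel matrix L; the function name, step count, and toy RBF kernel are assumptions for illustration, and this is not the accelerated algorithm developed in the thesis.

import numpy as np

def kdpp_mcmc_swap_sample(L, k, n_steps=5000, rng=None):
    """Approximately sample a size-k subset from a k-DPP with kernel L.

    Standard swap-chain Metropolis sampler (hypothetical helper, for
    illustration only): propose replacing one item of the current set S
    with an item outside S, and accept with probability
    min(1, det(L_T) / det(L_S)), where T is the proposed set.
    """
    rng = np.random.default_rng(rng)
    n = L.shape[0]
    S = list(rng.choice(n, size=k, replace=False))   # random initial size-k set
    det_S = np.linalg.det(L[np.ix_(S, S)])
    for _ in range(n_steps):
        pos = rng.integers(k)                                   # position in S to swap out
        new_item = rng.choice(np.setdiff1d(np.arange(n), S))    # item to swap in
        T = list(S)
        T[pos] = new_item
        det_T = np.linalg.det(L[np.ix_(T, T)])
        # Metropolis acceptance for the symmetric swap proposal.
        if det_S <= 0 or rng.random() < min(1.0, det_T / det_S):
            S, det_S = T, det_T
    return sorted(S)

# Toy usage on an assumed RBF kernel over random points.
X = np.random.default_rng(0).normal(size=(50, 5))
L = np.exp(-0.5 * np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=-1))
print(kdpp_mcmc_swap_sample(L, k=5, n_steps=2000, rng=0))

Each step above pays for a k-by-k determinant; via Schur complements the acceptance ratio reduces to terms of the form L_jj - u^T A^{-1} u, which is exactly the kind of subroutine the thesis bounds with Gauss-type quadrature.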
dc.description.statementofresponsibility	by Chengtao Li.	en_US
dc.format.extent	105 pages	en_US
dc.language.iso	eng	en_US
dc.publisher	Massachusetts Institute of Technology	en_US
dc.rights	M.I.T. theses are protected by copyright. They may be viewed from this source for any purpose, but reproduction or distribution in any format is prohibited without written permission. See provided URL for inquiries about permission.	en_US
dc.rights.uri	http://dspace.mit.edu/handle/1721.1/7582	en_US
dc.subject	Electrical Engineering and Computer Science.	en_US
dc.title	Efficient sampling for determinantal point processes	en_US
dc.type	Thesis	en_US
dc.description.degree	S.M.	en_US
dc.contributor.department	Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science	en_US
dc.identifier.oclc	965382973	en_US

