Coresets for fast Bayesian inference in Dirichlet process mixture models
Author(s): Reddy, Sushrutha P.
Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science.
Tamara Broderick and Trevor Campbell.
Bayesian inference is a powerful and flexible methodology lending itself to a multitude of applications. However, the computation required to perform Bayesian inference can be prohibitive in modern, data-rich settings. A recent line of work introduces coresets for Bayesian inference, which reduce the runtime of performing approximate Bayesian inference using MCMC in many common models, while preserving the fidelity of the output. In this work, we extend the coresets framework to apply to Dirichlet process mixture models, a flexible nonparametric framework allowing one to learn both the number and location of clusters from data. Our main technical innovation is a fast coreset slice sampler for inference in Dirichlet process mixture models, building on the slice sampler detailed in . When coupled with the methods for creating a coreset outlined in [2, 3], this provides a fully automated means of performing fast inference in such models. We then exhibit the empirical performance gains and accuracy of our coreset sampler, relative to that of the full sampler, on synthetic datasets as well as three real-world datasets of interest drawn from astrophysics, computer vision, and natural language processing.
Thesis: M. Eng., Massachusetts Institute of Technology, Department of Electrical Engineering and Computer Science, September, 2020. Cataloged from student-submitted PDF of thesis. Includes bibliographical references (pages 51-53).
Department: Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science
Massachusetts Institute of Technology
Electrical Engineering and Computer Science.