Improved Guarantees for Learning GMMs
Author: Liu, Allen
Advisor: Moitra, Ankur
Abstract
Mixtures of Gaussians (GMMs) are among the most commonly used statistical models. They are typically used to model data coming from two or more heterogeneous sources and have applications in a wide variety of fields, including statistics, biology, physics, and computer science. A fundamental task at the core of many of these applications is to learn the parameters of a mixture of Gaussians from samples. Starting with the seminal work of Karl Pearson in 1894 [81], there has been a long line of work on this problem [32, 6, 93, 48, 63, 78, 59, 49, 46, 39, 13, 65].
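To make the learning task concrete, the following minimal sketch fits a two-component mixture to synthetic one-dimensional data using scikit-learn's EM-based GaussianMixture. EM is a standard heuristic for this task, not the algorithm developed in this thesis, and the component parameters below are illustrative.

```python
# Minimal sketch of the parameter-learning task: recover the means,
# variances, and mixing weights of a GMM from samples. Uses scikit-learn's
# EM-based GaussianMixture (a standard heuristic, not the thesis algorithm).
import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(0)
# Draw samples from a two-component mixture in one dimension.
samples = np.concatenate([
    rng.normal(loc=-2.0, scale=1.0, size=500),
    rng.normal(loc=3.0, scale=0.5, size=500),
]).reshape(-1, 1)

gmm = GaussianMixture(n_components=2).fit(samples)
print("estimated means:", gmm.means_.ravel())
print("estimated variances:", gmm.covariances_.ravel())
print("estimated weights:", gmm.weights_)
```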
Despite extensive work, several important questions have remained open for decades; we address two of them here. First, we study the problem of clustering and learning in time polynomial in both the dimension 𝑑 and the number of components 𝑘. While an exponential dependence on 𝑘 is necessary for learning in the worst case, a better dependence is possible if one assumes that the components are clusterable. More precisely, for a mixture of 𝑘 isotropic Gaussians in ℝ^𝑑, as long as the means are pairwise separated by Ω(√(log 𝑘)), it is information-theoretically possible to cluster the samples and learn the parameters from polynomially many samples. Despite recent advances [67, 55, 46], existing polynomial-time algorithms all require a larger separation of Ω(𝑘^𝛿) for some 𝛿 > 0. In this work, we give an algorithm with runtime and sample complexity poly(𝑘, 𝑑) that provably works with essentially minimal separation.
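The separated-mixture setting above can be sketched directly: 𝑘 isotropic Gaussians in ℝ^𝑑 whose means are pairwise separated by at least C·√(log 𝑘). The constant C = 4 and the rejection-sampling placement of the means are illustrative assumptions, not taken from the thesis.

```python
# Sketch of the clustering setting: k isotropic Gaussians in R^d whose
# means are pairwise separated by at least C * sqrt(log k). The constant
# C and the rejection-sampling placement of means are illustrative choices.
import numpy as np

rng = np.random.default_rng(1)
k, d, C = 8, 50, 4.0
sep = C * np.sqrt(np.log(k))

# Place means by rejection sampling until all pairwise distances >= sep.
means = []
while len(means) < k:
    candidate = rng.normal(size=d) * sep
    if all(np.linalg.norm(candidate - m) >= sep for m in means):
        means.append(candidate)
means = np.array(means)

# Draw n samples from the uniform mixture of N(mean_i, I_d).
n = 2000
labels = rng.integers(k, size=n)
samples = means[labels] + rng.normal(size=(n, d))
print("min pairwise mean separation:",
      min(np.linalg.norm(a - b)
          for i, a in enumerate(means) for b in means[:i]))
```

In this regime the thesis gives a poly(𝑘, 𝑑)-time algorithm at separation essentially √(log 𝑘), whereas prior polynomial-time algorithms required separation 𝑘^𝛿.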
Second, we address robustness. Real data generally does not come from a distribution that is exactly a mixture of Gaussians, but rather from one that is close to a mixture of Gaussians. To capture this, we consider a more challenging setting, now ubiquitous in the field of robust statistics, in which an 𝜖-fraction of the datapoints may be arbitrarily, even adversarially, altered. There has been a flurry of recent work on robust algorithms for learning mixtures of Gaussians [39, 13, 65], but these results all require restrictions on the class of mixtures considered. In this work, we give an algorithm that attains provable robustness guarantees and works in full generality.
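The 𝜖-corruption model can likewise be illustrated in a few lines: draw samples from a single standard Gaussian, let an adversary overwrite an 𝜖-fraction of them, and compare a naive and a robust estimate of the mean. The particular corruption below (a far-away point mass) is an illustrative worst case for the empirical mean, not an example from the thesis.

```python
# Sketch of the eps-corruption model: an adversary replaces an eps-fraction
# of samples from N(0, I_d) with arbitrary points. Here the corruption is a
# single distant point mass, chosen to break the empirical mean.
import numpy as np

rng = np.random.default_rng(2)
n, d, eps = 10_000, 20, 0.05
samples = rng.normal(size=(n, d))

# Adversary: move the first eps*n points to one far-away location.
m = int(eps * n)
samples[:m] = 100.0 * np.ones(d)

# The empirical mean is dragged roughly eps * 100 per coordinate away from
# the true mean 0, while a coordinate-wise median stays close to it.
print("empirical mean error:", np.linalg.norm(samples.mean(axis=0)))
print("coordinate-wise median error:",
      np.linalg.norm(np.median(samples, axis=0)))
```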
Date issued: 2022-09
Department: Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science
Publisher: Massachusetts Institute of Technology