dc.contributor.advisor | Nancy A. Lynch. | en_US |
dc.contributor.author | Musco, Cameron N. (Cameron Nicholas) | en_US |
dc.contributor.other | Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science. | en_US |
dc.date.accessioned | 2016-03-03T20:30:41Z | |
dc.date.available | 2016-03-03T20:30:41Z | |
dc.date.copyright | 2015 | en_US |
dc.date.issued | 2015 | en_US |
dc.identifier.uri | http://hdl.handle.net/1721.1/101473 | |
dc.description | Thesis: S.M., Massachusetts Institute of Technology, Department of Electrical Engineering and Computer Science, 2015. | en_US |
dc.description | This electronic version was submitted by the student author. The certified thesis is available in the Institute Archives and Special Collections. | en_US |
dc.description | Cataloged from student-submitted PDF version of thesis. | en_US |
dc.description | Includes bibliographical references (pages 123-131). | en_US |
dc.description.abstract | In this thesis we study dimensionality reduction techniques for approximate k-means clustering. Given a large dataset, we consider how to quickly compress to a smaller dataset (a sketch), such that solving the k-means clustering problem on the sketch will give an approximately optimal solution on the original dataset. First, we provide an exposition of technical results of [CEM+15], which show that provably accurate dimensionality reduction is possible using common techniques such as principal component analysis, random projection, and random sampling. We next present empirical evaluations of dimensionality reduction techniques to supplement our theoretical results. We show that our dimensionality reduction algorithms, along with heuristics based on these algorithms, indeed perform well in practice. Finally, we discuss possible extensions of our work to neurally plausible algorithms for clustering and dimensionality reduction. This thesis is based on joint work with Michael Cohen, Samuel Elder, Nancy Lynch, Christopher Musco, and Madalina Persu. | en_US |
dc.description.statementofresponsibility | by Cameron N. Musco. | en_US |
dc.format.extent | 131 pages | en_US |
dc.language.iso | eng | en_US |
dc.publisher | Massachusetts Institute of Technology | en_US |
dc.rights | M.I.T. theses are protected by copyright. They may be viewed from this source for any purpose, but reproduction or distribution in any format is prohibited without written permission. See provided URL for inquiries about permission. | en_US |
dc.rights.uri | http://dspace.mit.edu/handle/1721.1/7582 | en_US |
dc.subject | Electrical Engineering and Computer Science. | en_US |
dc.title | Dimensionality reduction for k-means clustering | en_US |
dc.type | Thesis | en_US |
dc.description.degree | S.M. | en_US |
dc.contributor.department | Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science | |
dc.identifier.oclc | 940972715 | en_US |