dc.contributor.advisor | Ankur Moitra. | en_US |
dc.contributor.author | Persu, Elena-Mădălina | en_US |
dc.contributor.other | Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science. | en_US |
dc.date.accessioned | 2015-11-09T19:52:49Z | |
dc.date.available | 2015-11-09T19:52:49Z | |
dc.date.copyright | 2015 | en_US |
dc.date.issued | 2015 | en_US |
dc.identifier.uri | http://hdl.handle.net/1721.1/99847 | |
dc.description | Thesis: S.M. in Computer Science and Engineering, Massachusetts Institute of Technology, Department of Electrical Engineering and Computer Science, 2015. | en_US |
dc.description | Cataloged from PDF version of thesis. | en_US |
dc.description | Includes bibliographical references (pages 39-41). | en_US |
dc.description.abstract | Using random row projections, we show how to approximate a data matrix A with a much smaller sketch à that can be used to solve a general class of constrained k-rank approximation problems to within (1 + [epsilon]) error. Importantly, this class of problems includes k-means clustering. By reducing data points to just O(k) dimensions, our methods generically accelerate any exact, approximate, or heuristic algorithm for these ubiquitous problems. For k-means dimensionality reduction, we provide (1+ [epsilon]) relative error results for random row projections which improve on the (2 + [epsilon]) prior known constant factor approximation associated with this sketching technique, while preserving the number of dimensions. For k-means clustering, we show how to achieve a (9 + [epsilon]) approximation by Johnson-Lindenstrauss projecting data points to just 0(log k/[epsilon]2 ) dimensions. This gives the first result that leverages the specific structure of k-means to achieve dimension independent of input size and sublinear in k. | en_US |
dc.description.statementofresponsibility | by Elena-Mădălina Persu. | en_US |
dc.format.extent | 41 pages | en_US |
dc.language.iso | eng | en_US |
dc.publisher | Massachusetts Institute of Technology | en_US |
dc.rights | M.I.T. theses are protected by copyright. They may be viewed from this source for any purpose, but reproduction or distribution in any format is prohibited without written permission. See provided URL for inquiries about permission. | en_US |
dc.rights.uri | http://dspace.mit.edu/handle/1721.1/7582 | en_US |
dc.subject | Electrical Engineering and Computer Science. | en_US |
dc.title | Approximate k-means clustering through random projections | en_US |
dc.type | Thesis | en_US |
dc.description.degree | S.M. in Computer Science and Engineering | en_US |
dc.contributor.department | Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science | |
dc.identifier.oclc | 927412847 | en_US |