Show simple item record

dc.contributor.advisorAnkur Moitra.en_US
dc.contributor.authorPersu, Elena-Mădălinaen_US
dc.contributor.otherMassachusetts Institute of Technology. Department of Electrical Engineering and Computer Science.en_US
dc.date.accessioned2015-11-09T19:52:49Z
dc.date.available2015-11-09T19:52:49Z
dc.date.copyright2015en_US
dc.date.issued2015en_US
dc.identifier.urihttp://hdl.handle.net/1721.1/99847
dc.descriptionThesis: S.M. in Computer Science and Engineering, Massachusetts Institute of Technology, Department of Electrical Engineering and Computer Science, 2015.en_US
dc.descriptionCataloged from PDF version of thesis.en_US
dc.descriptionIncludes bibliographical references (pages 39-41).en_US
dc.description.abstractUsing random row projections, we show how to approximate a data matrix A with a much smaller sketch à that can be used to solve a general class of constrained k-rank approximation problems to within (1 + [epsilon]) error. Importantly, this class of problems includes k-means clustering. By reducing data points to just O(k) dimensions, our methods generically accelerate any exact, approximate, or heuristic algorithm for these ubiquitous problems. For k-means dimensionality reduction, we provide (1+ [epsilon]) relative error results for random row projections which improve on the (2 + [epsilon]) prior known constant factor approximation associated with this sketching technique, while preserving the number of dimensions. For k-means clustering, we show how to achieve a (9 + [epsilon]) approximation by Johnson-Lindenstrauss projecting data points to just 0(log k/[epsilon]2 ) dimensions. This gives the first result that leverages the specific structure of k-means to achieve dimension independent of input size and sublinear in k.en_US
dc.description.statementofresponsibilityby Elena-Mădălina Persu.en_US
dc.format.extent41 pagesen_US
dc.language.isoengen_US
dc.publisherMassachusetts Institute of Technologyen_US
dc.rightsM.I.T. theses are protected by copyright. They may be viewed from this source for any purpose, but reproduction or distribution in any format is prohibited without written permission. See provided URL for inquiries about permission.en_US
dc.rights.urihttp://dspace.mit.edu/handle/1721.1/7582en_US
dc.subjectElectrical Engineering and Computer Science.en_US
dc.titleApproximate k-means clustering through random projectionsen_US
dc.typeThesisen_US
dc.description.degreeS.M. in Computer Science and Engineeringen_US
dc.contributor.departmentMassachusetts Institute of Technology. Department of Electrical Engineering and Computer Science
dc.identifier.oclc927412847en_US


Files in this item

Thumbnail

This item appears in the following Collection(s)

Show simple item record