Show simple item record

dc.contributor.advisorJonathan A. Kelner.en_US
dc.contributor.authorKishore, Shaunaken_US
dc.contributor.otherMassachusetts Institute of Technology. Dept. of Mathematics.en_US
dc.date.accessioned2013-03-01T15:27:03Z
dc.date.available2013-03-01T15:27:03Z
dc.date.copyright2012en_US
dc.date.issued2012en_US
dc.identifier.urihttp://hdl.handle.net/1721.1/77534
dc.descriptionThesis (M. Eng.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science; and, (S.B.)--Massachusetts Institute of Technology, Dept. of Mathematics, 2012.en_US
dc.descriptionCataloged from PDF version of thesis.en_US
dc.descriptionIncludes bibliographical references (p. 18).en_US
dc.description.abstractWe obtain improved running times for two algorithms for clustering data: the expectation-maximization (EM) algorithm and Lloyd's algorithm. The EM algorithm is a heuristic for finding a mixture of k normal distributions in Rd that maximizes the probability of drawing n given data points. Lloyd's algorithm is a special case of this algorithm in which the covariance matrix of each normally-distributed component is required to be the identity. We consider versions of these algorithms where the number of mixture components is inferred by assuming a Dirichlet process as a generative model. The separation probability of this process, [alpha], is typically a small constant. We speed up each iteration of the EM algorithm from O(nd2k) to O(ndk log 3(k/a))+nd 2 ) time and each iteration of Lloyd's algorithm from O(ndk) to O(nd(k/a). 39) time.en_US
dc.description.statementofresponsibilityby Shaunak Kishore.en_US
dc.format.extent18 p.en_US
dc.language.isoengen_US
dc.publisherMassachusetts Institute of Technologyen_US
dc.rightsM.I.T. theses are protected by copyright. They may be viewed from this source for any purpose, but reproduction or distribution in any format is prohibited without written permission. See provided URL for inquiries about permission.en_US
dc.rights.urihttp://dspace.mit.edu/handle/1721.1/7582en_US
dc.subjectElectrical Engineering and Computer Science.en_US
dc.subjectMathematics.en_US
dc.titleAccelerated clustering through locality-sensitive hashingen_US
dc.typeThesisen_US
dc.description.degreeS.B.en_US
dc.description.degreeM.Eng.en_US
dc.contributor.departmentMassachusetts Institute of Technology. Department of Electrical Engineering and Computer Science
dc.contributor.departmentMassachusetts Institute of Technology. Department of Mathematics
dc.identifier.oclc826515141en_US


Files in this item

Thumbnail

This item appears in the following Collection(s)

Show simple item record