dc.contributor.advisor: Piotr Indyk. [en_US]
dc.contributor.author: Razenshteyn, Ilya [en_US]
dc.contributor.other: Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science. [en_US]
dc.date.accessioned: 2018-03-02T21:39:53Z
dc.date.available: 2018-03-02T21:39:53Z
dc.date.copyright: 2017 [en_US]
dc.date.issued: 2017 [en_US]
dc.identifier.uri: http://hdl.handle.net/1721.1/113934
dc.description: Thesis: Ph. D., Massachusetts Institute of Technology, Department of Electrical Engineering and Computer Science, 2017. [en_US]
dc.description: This electronic version was submitted by the student author. The certified thesis is available in the Institute Archives and Special Collections. [en_US]
dc.description: Cataloged from student-submitted PDF version of thesis. [en_US]
dc.description: Includes bibliographical references (pages 241-255). [en_US]
dc.description.abstract: We study two fundamental problems that involve massive high-dimensional datasets: approximate near neighbor search (ANN) and sketching. We obtain a number of new results including:
  • An algorithm for the ANN problem over the ℓ₁ and ℓ₂ distances that, for the first time, improves upon the Locality-Sensitive Hashing (LSH) framework. The key new insight is to use random space partitions that depend on the dataset.
  • An implementation of the core component of the above algorithm, which is released as FALCONN: a new C++ library for high-dimensional similarity search.
  • An efficient algorithm for the ANN problem over any distance that can be expressed as a symmetric norm.
  • For norms, we establish the equivalence between the existence of short and accurate sketches and good embeddings into ℓₚ spaces for 0 < p ≤ 2. We use this equivalence to show the first sketching lower bound for the Earth Mover's Distance (EMD). [en_US]
dc.description.statementofresponsibility: by Ilya Razenshteyn. [en_US]
dc.format.extent: 255 pages [en_US]
dc.language.iso: eng [en_US]
dc.publisher: Massachusetts Institute of Technology [en_US]
dc.rights: MIT theses are protected by copyright. They may be viewed, downloaded, or printed from this source but further reproduction or distribution in any format is prohibited without written permission. [en_US]
dc.rights.uri: http://dspace.mit.edu/handle/1721.1/7582 [en_US]
dc.subject: Electrical Engineering and Computer Science. [en_US]
dc.title: High-dimensional similarity search and sketching : algorithms and hardness [en_US]
dc.type: Thesis [en_US]
dc.description.degree: Ph. D. [en_US]
dc.contributor.department: Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science
dc.identifier.oclc: 1023861862 [en_US]

