| dc.contributor.advisor | Piotr Indyk. | en_US |
| dc.contributor.author | Razenshteyn, Ilya | en_US |
| dc.contributor.other | Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science. | en_US |
| dc.date.accessioned | 2018-03-02T21:39:53Z | |
| dc.date.available | 2018-03-02T21:39:53Z | |
| dc.date.copyright | 2017 | en_US |
| dc.date.issued | 2017 | en_US |
| dc.identifier.uri | http://hdl.handle.net/1721.1/113934 | |
| dc.description | Thesis: Ph. D., Massachusetts Institute of Technology, Department of Electrical Engineering and Computer Science, 2017. | en_US |
| dc.description | This electronic version was submitted by the student author. The certified thesis is available in the Institute Archives and Special Collections. | en_US |
| dc.description | Cataloged from student-submitted PDF version of thesis. | en_US |
| dc.description | Includes bibliographical references (pages 241-255). | en_US |
| dc.description.abstract | We study two fundamental problems that involve massive high-dimensional datasets: approximate near neighbor search (ANN) and sketching. We obtain a number of new results including: • An algorithm for the ANN problem over the ℓ₁ and ℓ₂ distances that, for the first time, improves upon the Locality-Sensitive Hashing (LSH) framework. The key new insight is to use random space partitions that depend on the dataset. • An implementation of the core component of the above algorithm, which is released as FALCONN: a new C++ library for high-dimensional similarity search. • An efficient algorithm for the ANN problem over any distance that can be expressed as a symmetric norm. • For norms, we establish the equivalence between the existence of short and accurate sketches and good embeddings into ℓp spaces for 0 < p ≤ 2. We use this equivalence to show the first sketching lower bound for the Earth Mover's Distance (EMD). | en_US |
| dc.description.statementofresponsibility | by Ilya Razenshteyn. | en_US |
| dc.format.extent | 255 pages | en_US |
| dc.language.iso | eng | en_US |
| dc.publisher | Massachusetts Institute of Technology | en_US |
| dc.rights | MIT theses are protected by copyright. They may be viewed, downloaded, or printed from this source but further reproduction or distribution in any format is prohibited without written permission. | en_US |
| dc.rights.uri | http://dspace.mit.edu/handle/1721.1/7582 | en_US |
| dc.subject | Electrical Engineering and Computer Science. | en_US |
| dc.title | High-dimensional similarity search and sketching : algorithms and hardness | en_US |
| dc.type | Thesis | en_US |
| dc.description.degree | Ph. D. | en_US |
| dc.contributor.department | Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science | |
| dc.identifier.oclc | 1023861862 | en_US |