MIT Libraries logoDSpace@MIT

MIT
View Item 
  • DSpace@MIT Home
  • MIT Libraries
  • MIT Theses
  • Doctoral Theses
  • View Item
  • DSpace@MIT Home
  • MIT Libraries
  • MIT Theses
  • Doctoral Theses
  • View Item
JavaScript is disabled for your browser. Some features of this site may not work without it.

High-dimensional similarity search and sketching : algorithms and hardness

Author(s)
Razenshteyn, Ilya
Thumbnail
DownloadFull printable version (2.171Mb)
Other Contributors
Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science.
Advisor
Piotr Indyk.
Terms of use
MIT theses are protected by copyright. They may be viewed, downloaded, or printed from this source but further reproduction or distribution in any format is prohibited without written permission. http://dspace.mit.edu/handle/1721.1/7582
Metadata
Show full item record
Abstract
We study two fundamental problems that involve massive high-dimensional datasets: approximate near neighbor search (ANN) and sketching. We obtain a number of new results including: ' An algorithm for the ANN problem over the ℓ₁ and ℓ₂ distances that, for the first time, improves upon the Locality-Sensitive Hashing (LSH) framework. The key new insight is to use random space partitions that depend on the dataset. ' An implementation of the core component of the above algorithm, which is released as FALCONN: a new C++ library for high-dimensional similarity search. ' An efficient algorithm for the ANN problem over any distance that can be expressed as a symmetric norm. ' For norms, we establish the equivalence between the existence of short and accurate sketches and good embeddings into ℓp spaces for 0 < p </- 2. We use this equivalence to show the first sketching lower bound for the Earth Mover's Distance (EMD).
Description
Thesis: Ph. D., Massachusetts Institute of Technology, Department of Electrical Engineering and Computer Science, 2017.
 
This electronic version was submitted by the student author. The certified thesis is available in the Institute Archives and Special Collections.
 
Cataloged from student-submitted PDF version of thesis.
 
Includes bibliographical references (pages 241-255).
 
Date issued
2017
URI
http://hdl.handle.net/1721.1/113934
Department
Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science
Publisher
Massachusetts Institute of Technology
Keywords
Electrical Engineering and Computer Science.

Collections
  • Doctoral Theses

Browse

All of DSpaceCommunities & CollectionsBy Issue DateAuthorsTitlesSubjectsThis CollectionBy Issue DateAuthorsTitlesSubjects

My Account

Login

Statistics

OA StatisticsStatistics by CountryStatistics by Department
MIT Libraries
PrivacyPermissionsAccessibilityContact us
MIT
Content created by the MIT Libraries, CC BY-NC unless otherwise noted. Notify us about copyright concerns.