A dimension reduction technique to preserve nearest neighbors on high dimensional data
Author(s)
Chachamis, Christos Nestor.
Other Contributors
Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science.
Advisor
Samuel Madden.
Abstract
Dimension reduction techniques are widely used for tasks such as visualization and data pre-processing. In this project, we develop a new dimension reduction method that addresses the problem of approximate nearest neighbor search on high-dimensional data. It uses a deep neural network to map the data to a lower dimension while preserving nearest neighbors and local structure. We evaluate the network on several datasets, both synthetic and real, and compare our method against other dimension reduction techniques, such as t-SNE. Our experimental results show that the method preserves local structure well in both the training and test data. In particular, for most test points, the distance to the predicted nearest neighbor is within 10% of the distance to the actual nearest neighbor. Another advantage of our method is that it can be applied to new, unseen data without refitting the model from scratch.
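The 10% criterion in the abstract can be illustrated with a small sketch. This is not the thesis code. It assumes one reading of the criterion: a query counts as a hit when the original-space distance to the neighbor found in the reduced space is at most 1.10 times the distance to the true nearest neighbor. The names `fraction_within_tolerance` and `make_random_projection` are hypothetical, and a random linear projection stands in for the trained neural network, which the abstract does not specify.

```python
import math
import random


def euclidean(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def nearest_index(query, points):
    """Index of the point in `points` closest to `query` (brute force)."""
    return min(range(len(points)), key=lambda i: euclidean(query, points[i]))


def fraction_within_tolerance(queries, database, reduce_fn, tolerance=0.10):
    """Fraction of queries whose predicted neighbor is nearly as close as the true one.

    The predicted neighbor is found in the reduced space, but its distance
    is measured in the original space (an assumed reading of the abstract).
    """
    reduced_db = [reduce_fn(p) for p in database]
    hits = 0
    for q in queries:
        true_idx = nearest_index(q, database)
        pred_idx = nearest_index(reduce_fn(q), reduced_db)
        true_dist = euclidean(q, database[true_idx])
        pred_dist = euclidean(q, database[pred_idx])
        if pred_dist <= (1.0 + tolerance) * true_dist:
            hits += 1
    return hits / len(queries)


def make_random_projection(in_dim, out_dim, seed=0):
    """Stand-in reducer: a fixed random linear map, NOT the thesis network."""
    rng = random.Random(seed)
    matrix = [
        [rng.gauss(0.0, 1.0) / math.sqrt(out_dim) for _ in range(in_dim)]
        for _ in range(out_dim)
    ]

    def project(x):
        return [sum(w * v for w, v in zip(row, x)) for row in matrix]

    return project


# Synthetic data: 200 database points and 20 held-out queries in 32 dimensions.
rng = random.Random(1)
database = [[rng.gauss(0.0, 1.0) for _ in range(32)] for _ in range(200)]
queries = [[rng.gauss(0.0, 1.0) for _ in range(32)] for _ in range(20)]

reducer = make_random_projection(in_dim=32, out_dim=8)
score = fraction_within_tolerance(queries, database, reducer)
print(f"fraction of queries within 10%: {score:.2f}")
```

Because the reducer is an ordinary function applied to each point, the same fitted map serves both the database and previously unseen queries, which mirrors the out-of-sample property the abstract describes.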
Description
Thesis: M. Eng., Massachusetts Institute of Technology, Department of Electrical Engineering and Computer Science, May, 2020. Cataloged from the official PDF of thesis. Includes bibliographical references (pages 71-72).
Date issued
2020
Department
Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science
Publisher
Massachusetts Institute of Technology
Keywords
Electrical Engineering and Computer Science.