Sculpting representations for deep learning

Author(s)
Rippel, Oren
Full printable version (16.04 MB)
Other Contributors
Massachusetts Institute of Technology. Department of Mathematics.
Advisor
Ryan P. Adams and Ankur Moitra.
Terms of use
M.I.T. theses are protected by copyright. They may be viewed from this source for any purpose, but reproduction or distribution in any format is prohibited without written permission. See provided URL for inquiries about permission. http://dspace.mit.edu/handle/1721.1/7582
Abstract
In machine learning, the choice of space in which to represent our data is of vital importance to their effective and efficient analysis. In this thesis, we develop approaches to address a number of problems in representation learning. We employ deep learning as a means of sculpting our representations, and also develop improved representations for deep learning models. We present contributions that are based on five papers and make progress in several different research directions. First, we present techniques that leverage spatial and relational structure to achieve greater computational efficiency of model optimization and query retrieval. This allows us to train distance metric learning models 5-30 times faster; optimize convolutional neural networks 2-5 times faster; perform content-based image retrieval hundreds of times faster on codes hundreds of times longer than feasible before; and improve the complexity of Bayesian optimization to linear in the number of observations, in contrast to the cubic dependence in its naive Gaussian process formulation. Furthermore, we introduce ideas to facilitate preservation of relevant information within the learned representations, and demonstrate that this leads to improved supervision results. Our approaches achieve state-of-the-art classification and transfer learning performance on a number of well-known machine learning benchmarks. In addition, while deep learning models are able to discover structure in high-dimensional input domains, they only offer implicit probabilistic descriptions. We develop an algorithm to enable probabilistic interpretability of deep representations. It constructs a transformation to a representation space under which the mapped distribution is approximately factorized and has known marginals. This allows tractable density estimation and inference within this alternate domain.
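The final contribution described above is in the spirit of change-of-variables density models: if an invertible map carries the data into a representation whose distribution factorizes with known marginals, the data density follows from the standard change-of-variables identity. A minimal sketch of that identity, with the symbols f, x, z, and D introduced here only for illustration and not taken from the thesis:

p_X(x) = p_Z\big(f(x)\big)\,\left|\det \frac{\partial f(x)}{\partial x}\right|,
\qquad
p_Z(z) = \prod_{d=1}^{D} p_{Z_d}(z_d).

Under these assumptions, evaluating the density of x requires only the (known) factorized marginals of z = f(x) and the Jacobian determinant of the learned transformation, which is what makes density estimation and inference tractable in the transformed domain.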
Description
Thesis: Ph. D., Massachusetts Institute of Technology, Department of Mathematics, 2016.
 
Cataloged from PDF version of thesis.
 
Includes bibliographical references (pages 149-164).
 
Date issued
2016
URI
http://hdl.handle.net/1721.1/104581
Department
Massachusetts Institute of Technology. Department of Mathematics
Publisher
Massachusetts Institute of Technology
Keywords
Mathematics.

Collections
  • Doctoral Theses
