
Interpretable neural networks via alignment and distribution propagation

Author(s)
Malalur, Paresh (Paresh G.)
Download: 1124682413-MIT.pdf (2.222 MB)
Other Contributors
Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science.
Advisor
Tommi Jaakkola.
Terms of use
MIT theses are protected by copyright. They may be viewed, downloaded, or printed from this source but further reproduction or distribution in any format is prohibited without written permission. http://dspace.mit.edu/handle/1721.1/7582
Abstract
In this thesis, we aim to develop methodologies to better understand and improve the performance of deep neural networks in settings where data is limited or missing. Unlike data-rich tasks, where neural networks have achieved human-level performance, many problems are inherently data limited; on these, the models still fall short of human-level performance and leave abundant room for improvement. We focus on three types of limited-data problems: one-shot learning and open-set recognition in the one-shot setting, unsupervised learning, and classification with missing data. The first limited-data setting we tackle is when there are only a few examples per object type. During object classification, an attention mechanism can be used to highlight the area of the image that the model focuses on, offering a narrow view into the mechanism of classification.
 
We expand on this idea by forcing the method to explicitly align the images to be classified with reference images representing the classes. The alignment mechanism is learned and therefore does not require that the reference objects resemble those being classified. Beyond explanation, our exemplar-based cross-alignment method enables classification with only a single example per category (one-shot) or in the absence of any labels for new classes (open-set). While one-shot and open-set recognition operate in cases where complete data is available for a few examples, the unsupervised and missing-data settings focus on cases where the labels are missing or where only partial input is available, respectively. Variational auto-encoders are popular unsupervised learning models that learn to map the input distribution into a simple latent distribution.
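To make the cross-alignment idea concrete, the following is a minimal illustrative sketch, not the thesis code: a query image's patch embeddings are softly aligned to the patch embeddings of one reference exemplar per class, and the best-aligning class is predicted. The use of PyTorch, the function names, and the cosine-similarity scoring are all assumptions made for illustration.

```python
import torch
import torch.nn.functional as F

def alignment_score(query_feats, ref_feats):
    # query_feats: (Nq, D) patch embeddings of the query image
    # ref_feats:   (Nr, D) patch embeddings of one reference exemplar
    attn = F.softmax(query_feats @ ref_feats.T, dim=-1)   # (Nq, Nr) soft alignment weights
    aligned = attn @ ref_feats                             # (Nq, D) aligned reference features
    # Score: mean cosine similarity between query patches and their aligned counterparts.
    return F.cosine_similarity(query_feats, aligned, dim=-1).mean()

def one_shot_classify(query_feats, exemplars):
    # exemplars: list of (Nr, D) tensors, one reference image per class (one-shot setting)
    scores = torch.stack([alignment_score(query_feats, ref) for ref in exemplars])
    return int(scores.argmax())
```

Because the score is computed from the alignment itself, inspecting the attention weights also shows which parts of the reference each query region was matched to, which is the interpretability angle the abstract describes.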
 
We introduce a mechanism for approximately propagating Gaussian densities through neural networks, using the Hellinger distance to find the best approximation, and demonstrate how to use this framework to improve the latent-code efficiency of variational auto-encoders. Expanding on this idea, we introduce a novel method to learn the mapping between the input space and the latent space, which further improves the efficiency of the latent code by overcoming the variational bound. The final limited-data setting we explore is when the input data is incomplete or very noisy. Neural networks are inherently feed-forward, so inference methods developed for probabilistic models cannot be applied directly. We introduce two different methods to handle missing data. We first introduce a simple feed-forward model that redefines the linear operator as an ensemble that reweights the activations when portions of its receptive field are missing.
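As a rough illustration of what "propagating Gaussian densities through a network" can look like, the sketch below pushes a diagonal Gaussian through a linear layer exactly and through a ReLU by matching the first two moments. The thesis selects the approximating Gaussian by minimizing the Hellinger distance, which is not reproduced here; the function names and the diagonal-covariance simplification are assumptions for illustration.

```python
import numpy as np
from scipy.stats import norm

def linear_propagate(mu, var, W, b):
    # An affine map of a Gaussian is Gaussian; keep only the diagonal of the
    # output covariance: diag(W diag(var) W^T) = (W**2) @ var.
    return W @ mu + b, (W ** 2) @ var

def relu_propagate(mu, var):
    # Approximate max(0, x) for x ~ N(mu, var) by a Gaussian with the true
    # first two moments (computed per unit).
    sigma = np.sqrt(var)
    z = mu / sigma
    m1 = mu * norm.cdf(z) + sigma * norm.pdf(z)                    # E[relu(x)]
    m2 = (mu ** 2 + var) * norm.cdf(z) + mu * sigma * norm.pdf(z)  # E[relu(x)^2]
    return m1, np.maximum(m2 - m1 ** 2, 1e-12)
```

Applying these two steps layer by layer yields an approximate output distribution from a single feed-forward pass, instead of a point prediction.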
 
We then use some of these insights to develop deep networks that propagate distributions of activations instead of point activations, allowing us to use message-passing methods to compensate for missing data while retaining the feed-forward approach when no data is missing.
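A minimal sketch of how missing inputs might enter such a distribution-propagating network, under the assumptions of the previous sketch: observed dimensions are encoded as near-delta Gaussians, missing dimensions fall back to a prior mean and variance, and the moment rules above carry the resulting uncertainty forward. This is an illustration of the idea, not the thesis implementation; the prior and the function name are hypothetical.

```python
import numpy as np

def encode_with_missing(x, mask, prior_mu=0.0, prior_var=1.0):
    # x: (D,) raw inputs; mask: (D,) boolean, True where observed.
    # Observed dims become near-delta Gaussians; missing dims use the prior.
    mu = np.where(mask, x, prior_mu)
    var = np.where(mask, 1e-6, prior_var)
    return mu, var
```

Feeding (mu, var) into linear_propagate and relu_propagate from the previous sketch makes units whose receptive fields are mostly unobserved come out with large variance, which is the kind of signal a message-passing compensation step can exploit.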
 
Description
This electronic version was submitted by the student author. The certified thesis is available in the Institute Archives and Special Collections.
 
Thesis: Ph. D., Massachusetts Institute of Technology, Department of Electrical Engineering and Computer Science, 2019
 
Cataloged from student-submitted PDF version of thesis.
 
Includes bibliographical references (pages 145-150).
 
Date issued
2019
URI
https://hdl.handle.net/1721.1/122686
Department
Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science
Publisher
Massachusetts Institute of Technology
Keywords
Electrical Engineering and Computer Science.

Collections
  • Doctoral Theses
