
dc.contributor.advisor: Tommi Jaakkola. [en_US]
dc.contributor.author: Malalur, Paresh (Paresh G.) [en_US]
dc.contributor.other: Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science. [en_US]
dc.date.accessioned: 2019-11-04T19:53:13Z
dc.date.available: 2019-11-04T19:53:13Z
dc.date.copyright: 2019 [en_US]
dc.date.issued: 2019 [en_US]
dc.identifier.uri: https://hdl.handle.net/1721.1/122686
dc.description: This electronic version was submitted by the student author. The certified thesis is available in the Institute Archives and Special Collections. [en_US]
dc.description: Thesis: Ph. D., Massachusetts Institute of Technology, Department of Electrical Engineering and Computer Science, 2019 [en_US]
dc.description: Cataloged from student-submitted PDF version of thesis. [en_US]
dc.description: Includes bibliographical references (pages 145-150). [en_US]
dc.description.abstract: In this thesis, we aim to develop methodologies to better understand and improve the performance of Deep Neural Networks in various settings where data is limited or missing. Unlike data-rich tasks where neural networks have achieved human-level performance, other problems are naturally data-limited; there, these models have fallen short of human-level performance and there is abundant room for improvement. We focus on three types of problems where data is limited: one-shot learning and open-set recognition in the one-shot setting, unsupervised learning, and classification with missing data. The first limited-data setting that we tackle is when there are only a few examples per object type. During object classification, an attention mechanism can be used to highlight the area of the image that the model focuses on, thus offering a narrow view into the mechanism of classification. [en_US]
dc.description.abstract: We expand on this idea by forcing the method to explicitly align images to be classified to reference images representing the classes. The mechanism of alignment is learned and therefore does not require that the reference objects are anything like those being classified. Beyond explanation, our exemplar-based cross-alignment method enables classification with only a single example per category (one-shot) or in the absence of any labels about new classes (open-set). While one-shot and open-set recognition operate in cases where complete data is available for a few examples, the unsupervised and missing-data settings focus on cases where the labels are missing or where only partial input is available, respectively. Variational Auto-Encoders are a popular unsupervised learning model that learns to map the input distribution into a simple latent distribution. [en_US]
dc.description.abstract: We introduce a mechanism of approximate propagation of Gaussian densities through neural networks, using the Hellinger distance metric to find the best approximation, and demonstrate how to use this framework to improve the latent code efficiency of Variational Auto-Encoders. Expanding on this idea further, we introduce a novel method to learn the mapping between the input space and latent space which further improves the efficiency of the latent code by overcoming the variational bound. The final limited-data setting we explore is when the input data is incomplete or very noisy. Neural Networks are inherently feed-forward, and hence inference methods developed for probabilistic models cannot be applied directly. We introduce two different methods to handle missing data. We first introduce a simple feed-forward model that redefines the linear operator as an ensemble to reweight the activations when portions of its receptive field are missing. [en_US]
dc.description.abstract: We then use some of the insights gained to develop deep networks that propagate distributions of activations instead of point activations, allowing us to use message passing methods to compensate for missing data while maintaining the feed-forward style approach when data is not missing. [en_US]
dc.description.statementofresponsibility: by Paresh Malalur. [en_US]
dc.format.extent: 150 pages [en_US]
dc.language.iso: eng [en_US]
dc.publisher: Massachusetts Institute of Technology [en_US]
dc.rights: MIT theses are protected by copyright. They may be viewed, downloaded, or printed from this source but further reproduction or distribution in any format is prohibited without written permission. [en_US]
dc.rights.uri: http://dspace.mit.edu/handle/1721.1/7582 [en_US]
dc.subject: Electrical Engineering and Computer Science. [en_US]
dc.title: Interpretable neural networks via alignment and distribution propagation [en_US]
dc.type: Thesis [en_US]
dc.description.degree: Ph. D. [en_US]
dc.contributor.department: Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science [en_US]
dc.identifier.oclc: 1124682413 [en_US]
dc.description.collection: Ph.D. Massachusetts Institute of Technology, Department of Electrical Engineering and Computer Science [en_US]
dspace.imported: 2019-11-04T19:53:12Z [en_US]
mit.thesis.degree: Doctoral [en_US]
mit.thesis.department: EECS [en_US]

