Deep Attentional Modulation for Zero-shot Learning in Object Recognition
Author(s)
Singh, Aaditya
Thesis PDF (2.835 MB)
Advisor
Katz, Boris
Abstract
In the human brain, top-down attention plays a crucial role in the human ability to recognize seemingly infinite visual concepts using the same visual pathway. Even more impressive, humans can recognize objects from just a description (zero-shot) or a few examples (few-shot). Traditionally, artificial neural networks have struggled to reproduce this ability, with large performance drops in the zero- and few-shot domains caused by overfitting. Most methods focus on learning a good, fixed feature extractor, then tying those features to new classes using linear transformations, which are less prone to overfitting on few examples. On the opposite side of this spectrum of simpler models are meta-learning techniques that finetune whole feature extractors to fit the few examples. While both of these approaches have shown reasonable success, we believe that a middle ground, incorporating inductive biases inspired by biological attention, can lead to improved performance. In this work, we study the use of top-down attentional modulation, already shown to be useful in visual question answering, in the domain of zero- and few-shot object recognition. We find that deep modulation can be critical in distinguishing unseen classes from previously seen classes in the zero-shot setting, and also provides gains in distinguishing between unseen classes in the few-shot domain. We hope that the insights brought to light in this work can contribute to the growing need for computer vision systems that generalize to novel concepts and new environments.
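To make the idea of deep top-down modulation concrete, the following is a minimal sketch of feature-wise modulation in the style of FiLM (the mechanism the abstract alludes to from visual question answering). A class-description embedding predicts a per-channel scale and shift that is applied at every layer of a small feed-forward extractor. All names, dimensions, and the two-layer architecture are illustrative assumptions, not the thesis's actual model.

```python
import numpy as np

rng = np.random.default_rng(0)

def relu(x):
    return np.maximum(x, 0.0)

def predict_modulation(class_emb, w_gamma, w_beta):
    """Map a class-description embedding to per-channel scale and shift
    (hypothetical linear predictors, one pair per layer)."""
    return class_emb @ w_gamma, class_emb @ w_beta

def modulated_layer(features, weights, gamma, beta):
    """One layer whose activations are modulated top-down, channel-wise."""
    h = relu(features @ weights)
    return gamma * h + beta

# Toy dimensions: 8-dim input, two 16-channel layers, 4-dim class embedding.
x = rng.normal(size=8)
class_emb = rng.normal(size=4)           # stands in for a class description
w1 = rng.normal(size=(8, 16))
w2 = rng.normal(size=(16, 16))
g1, b1 = predict_modulation(class_emb, rng.normal(size=(4, 16)), rng.normal(size=(4, 16)))
g2, b2 = predict_modulation(class_emb, rng.normal(size=(4, 16)), rng.normal(size=(4, 16)))

# "Deep" modulation: the class signal reshapes features at every layer,
# not only at a final linear classifier over fixed features.
h = modulated_layer(x, w1, g1, b1)
h = modulated_layer(h, w2, g2, b2)
print(h.shape)  # (16,)
```

The contrast with the two extremes in the abstract: a fixed extractor would drop `gamma`/`beta` and only learn a new linear readout, while meta-learning would update `w1` and `w2` themselves; here the backbone stays fixed and only the lightweight modulation carries the class-specific information.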
Date issued
2021-06
Department
Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science
Publisher
Massachusetts Institute of Technology