Generalization of deep neural networks to unseen attribute combinations
Author(s): Henry, Timothy G.
Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science.
Visual understanding results from a combined understanding of primitive visual attributes such as color, texture, and shape. This allows humans and other primates to generalize their understanding of objects to new combinations of attributes. For instance, one can understand that a pink elephant is an elephant even if they have never seen this particular combination of color and shape before. However, is it the case that deep neural networks (DNNs) are able to generalize to such novel combinations in object recognition or other related vision tasks? This thesis demonstrates that (1) the ability of DNNs to generalize to unseen attribute combinations increases with the increased diversity of combinations seen in training as a percentage of the total combination space, (2) this effect is largely independent of the specifics of the DNN architecture used, (3) while single-task and multi-task formulations of supervised attribute classification problems may lead to similar performance on seen combinations, single-task formulations have a superior ability to generalize to unseen combinations, and (4) DNNs demonstrating the ability to generalize well in this setting learn to do so by leveraging emergent hidden units that exhibit properties of attribute selectivity and invariance.
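The seen/unseen setup described in the abstract can be sketched as follows. This is an illustrative sketch only, not the thesis's actual experimental code: the function name `split_combinations`, the attribute names and values, and the `seen_fraction` parameter are all assumptions introduced here. It enumerates the full attribute-combination space and holds out a fraction of combinations so that a model could be trained on the "seen" set and evaluated on the "unseen" set.

```python
import itertools
import random


def split_combinations(attribute_values, seen_fraction, seed=0):
    """Split the full attribute-combination space into seen and unseen sets.

    attribute_values: dict mapping attribute name -> list of possible values
                      (hypothetical names, for illustration only).
    seen_fraction:    fraction of all combinations available during training.
    """
    # Cartesian product of all attribute values = total combination space.
    all_combos = list(itertools.product(*attribute_values.values()))
    rng = random.Random(seed)
    rng.shuffle(all_combos)
    # Keep at least one combination in the seen set.
    n_seen = max(1, round(seen_fraction * len(all_combos)))
    return all_combos[:n_seen], all_combos[n_seen:]


# Hypothetical attributes echoing the abstract's "pink elephant" example.
attributes = {
    "color": ["pink", "gray", "brown"],
    "shape": ["elephant", "horse", "dog"],
}
seen, unseen = split_combinations(attributes, seen_fraction=2 / 3)
```

Raising `seen_fraction` corresponds to increasing the diversity of combinations seen in training as a percentage of the total combination space. A purely random split like this one may leave some individual attribute value absent from the seen set, so a more careful split would likely need to check for that.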
Thesis: M. Eng., Massachusetts Institute of Technology, Department of Electrical Engineering and Computer Science, February, 2020. Cataloged from student-submitted PDF of thesis. Includes bibliographical references (pages 71-73).
Department: Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science
Massachusetts Institute of Technology
Electrical Engineering and Computer Science.