Visual concept-metaconcept learning
Author(s)
Han, Chi; Mao, Jiayuan; Gan, Chuang; Tenenbaum, Joshua B.; Wu, Jiajun
Publisher Policy
Article is made available in accordance with the publisher's policy and may be subject to US copyright law. Please refer to the publisher's site for terms of use.
Abstract
Humans reason with concepts and metaconcepts: we recognize red and green from visual input; we also understand that they describe the same property of objects (i.e., the color). In this paper, we propose the visual concept-metaconcept learner (VCML) for joint learning of concepts and metaconcepts from images and associated question-answer pairs. The key is to exploit the bidirectional connection between visual concepts and metaconcepts. Visual representations provide grounding cues for predicting relations between unseen pairs of concepts. Knowing that red and green describe the same property of objects, we generalize to the fact that cube and sphere also describe the same property of objects, since they both categorize the shape of objects. Meanwhile, knowledge about metaconcepts empowers visual concept learning from limited, noisy, and even biased data. From just a few examples of purple cubes we can understand a new color purple, which resembles the hue of the cubes instead of the shape of them. Evaluation on both synthetic and real-world datasets validates our claims.
© 2019 Neural Information Processing Systems Foundation. All rights reserved.
Date issued
2019-01
Department
Massachusetts Institute of Technology. Computer Science and Artificial Intelligence Laboratory; MIT-IBM Watson AI Lab; Massachusetts Institute of Technology. Department of Brain and Cognitive Sciences; Center for Brains, Minds, and Machines
Journal
Advances in Neural Information Processing Systems
Citation
Han, C, Mao, J, Gan, C, Tenenbaum, JB and Wu, J. 2019. "Visual concept-metaconcept learning." Advances in Neural Information Processing Systems, 32.
Version: Final published version