Compositional simulation in perception and cognition
Author(s)Siegel, Max Harmon.
Massachusetts Institute of Technology. Department of Brain and Cognitive Sciences.
Joshua B. Tenenbaum.
MetadataShow full item record
Despite rapid recent progress in machine perception and models of biological perception, fundamental questions remain open. In particular, the paradigm underlying these advances, pattern recognition, requires large amounts of training data and struggles to generalize to situations outside the domain of training. In this thesis, I focus on a broad class of perceptual concepts - those that are generated by the composition of multiple causal processes, in this case certain physical interactions - that human use essentially and effortlessly in making sense of the world, but for which any specific instance is extremely rare in our experience. Pattern recognition, or any strongly learning-based approach, might then be an inappropriate way to understand people's perceptual inferences.I propose an alternative approach, compositional simulation, that can in principle account for these inferences, and I show in practice that it provides both qualitative and quantitative explanatory value for several experimental settings. Consider a box and a number of marbles in the box, and imagine trying to guess how many there are based on the sound produced when the box is shaken. I demonstrate that human observers are quite good at this task, even for subtle numerical differences. Compositional simulation hypothesizes that people succeed by leveraging internal causal models: they simulate the physical collisions that would result from shaking the box (in a particular way), and what those collisions would sound like, for different numbers of marbles. They then compare their simulated sounds with the sound they heard.Crucially these simulation models can generalize to a wide range of percepts, even those never before experienced, by exploiting the compositional structure of the causal processes being modeled, in terms of objects and their interactions, and physical dynamics and auditory events. Because the motion of the box is a key ingredient in physical simulation, I hypothesize that people can take cues to motion into account in our task; I give evidence that people do. I also consider the domain of unfamiliar objects covered by cloth. a similar mechanism should enable successful recognition even for unfamiliar covered objects (like airplanes). I show that people can succeed in the recognition task, even when the shape of the object is very different when covered. Finally, I show how compositional simulation provides a way to "glue together" the data received by perception (images and sounds) with the contents of cognition (objects).I apply compositional simulation to two cognitive domains: children's intuitive exploration (obtaining quantitative prediction of exploration time), and causal inference from audiovisual information.
Thesis: Ph. D., Massachusetts Institute of Technology, Department of Brain and Cognitive Sciences, 2019Cataloged from PDF version of thesis. "February 2019."Includes bibliographical references (pages 97-103).
DepartmentMassachusetts Institute of Technology. Department of Brain and Cognitive Sciences
Massachusetts Institute of Technology
Brain and Cognitive Sciences.