A normalization model of visual search predicts single trial human fixations in an object search task.
Author(s): Miconi, Thomas; Groomes, Laura; Kreiman, Gabriel
When searching for an object in a scene, how does the brain decide where to look next? Theories of visual search suggest the existence of a global attentional map, computed by integrating bottom-up visual information with top-down, target-specific signals. Where, when and how this integration is performed remains unclear. Here we describe a simple mechanistic model of visual search that is consistent with neurophysiological and neuroanatomical constraints, can localize target objects in complex scenes, and predicts single-trial human behavior in a search task among complex objects. This model posits that target-specific modulation is applied at every point of a retinotopic area selective for complex visual features and implements local normalization through divisive inhibition. The combination of multiplicative modulation and divisive normalization creates an attentional map in which aggregate activity at any location tracks the correlation between input and target features, with relative and controllable independence from bottom-up saliency. We first show that this model can localize objects in both composite images and natural scenes and demonstrate the importance of normalization for successful search. We next show that this model can predict human fixations on single trials, including error and target-absent trials. We argue that this simple model captures non-trivial properties of the attentional system that guides visual search in humans.
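The interaction between multiplicative modulation and divisive normalization described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the `sigma` constant, the toy feature values, and the choice to pool normalization over features at a single location (rather than over a spatial neighborhood) are all assumptions made for illustration.

```python
import numpy as np

def attention_map(features, target_weights, sigma=0.1, normalize=True):
    """Toy attentional map (illustrative assumption, not the paper's exact model).

    features:       array of shape (K, H, W), non-negative feature responses
    target_weights: array of shape (K,), top-down target-specific gains
    sigma:          hypothetical constant that keeps the division stable
    """
    features = np.asarray(features, dtype=float)
    w = np.asarray(target_weights, dtype=float)
    # Multiplicative modulation: each feature channel is scaled by its target gain.
    modulated = features * w[:, None, None]
    drive = modulated.sum(axis=0)
    if not normalize:
        return drive
    # Divisive normalization: divide by the unmodulated activity pooled
    # across features at the same location (a simplification of "local").
    pool = features.sum(axis=0)
    return drive / (sigma + pool)

def next_fixation(att_map):
    """Return the (row, col) of the map's maximum as plain Python ints."""
    row, col = np.unravel_index(np.argmax(att_map), att_map.shape)
    return int(row), int(col)

# Toy scene: 4 feature channels on a 2x2 grid.
target = np.array([1.0, 0.0, 1.0, 0.0])
scene = np.zeros((4, 2, 2))
scene[:, 0, 0] = 0.5 * target                      # faint target at (0, 0)
scene[:, 1, 1] = [1.5, 3.0, 1.5, 3.0]              # strong distractor at (1, 1)

raw_choice = next_fixation(attention_map(scene, target, normalize=False))
norm_choice = next_fixation(attention_map(scene, target, normalize=True))
```

In this toy scene the unnormalized map is dominated by the high-contrast distractor, while the normalized map peaks at the faint target, which is one way to picture the "relative and controllable independence from bottom-up saliency" mentioned above.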
Center for Brains, Minds and Machines (CBMM), arXiv
CBMM Memo Series; 008
Pattern Recognition, Vision, Neuroscience