Do Deep Neural Networks Suffer from Crowding?

Volokitin, Anna; Roig, Gemma; Poggio, Tomaso

Download: CBMM-Memo-069.pdf (6.465 MB)

Terms of use

Attribution-NonCommercial-ShareAlike 3.0 United States http://creativecommons.org/licenses/by-nc-sa/3.0/us/

Abstract

Crowding is a visual effect suffered by humans, in which an object that can be recognized in isolation can no longer be recognized when other objects, called flankers, are placed close to it. In this work, we study the effect of crowding in artificial Deep Neural Networks for object recognition. We analyze both standard deep convolutional neural networks (DCNNs) and a new version of DCNNs that is 1) multi-scale and 2) uses convolution filters whose size changes with eccentricity with respect to the center of fixation. Such networks, which we call eccentricity-dependent, are a computational model of the feedforward path of the primate visual cortex. Our results reveal that the eccentricity-dependent model, trained on target objects in isolation, can recognize such targets in the presence of flankers if the targets are near the center of the image, whereas DCNNs cannot. Also, for all tested networks trained on targets in isolation, recognition accuracy decreases the closer the flankers are to the target and the more flankers there are. We find that visual similarity between the target and flankers also plays a role and that pooling in early layers of the network leads to more crowding. Additionally, we show that incorporating the flankers into the images of the training set does not improve performance with crowding.
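
The abstract does not say how the multi-scale, eccentricity-dependent input is built. As a rough illustration only, the sketch below takes nested crops around an assumed fixation point at the image center and average-pools each one to the same resolution, so that a fixed-size filter covers a larger image region at larger eccentricity. The function name `multiscale_crops`, its parameters, the factor-of-two scale spacing, and the use of average pooling are all assumptions made for this example, not details taken from the memo.

```python
import numpy as np


def multiscale_crops(image, num_scales=3, base_size=8):
    """Return num_scales center crops, each resampled to base_size x base_size.

    Hypothetical helper: crop k covers a square of side base_size * 2**k
    around the image center (the assumed fixation point) and is
    average-pooled by a factor of 2**k, so resolution drops with eccentricity.
    """
    if base_size % 2 != 0:
        raise ValueError("base_size must be even")
    h, w = image.shape
    largest = base_size * 2 ** (num_scales - 1)
    if min(h, w) < largest:
        raise ValueError("image too small for the requested scales")

    cy, cx = h // 2, w // 2
    crops = []
    for k in range(num_scales):
        factor = 2 ** k
        half = base_size * factor // 2
        crop = image[cy - half:cy + half, cx - half:cx + half]
        # Average over non-overlapping factor x factor blocks.
        pooled = crop.reshape(base_size, factor, base_size, factor).mean(axis=(1, 3))
        crops.append(pooled)
    # Shape: (num_scales, base_size, base_size), one channel per scale.
    return np.stack(crops)


if __name__ == "__main__":
    img = np.arange(32 * 32, dtype=float).reshape(32, 32)
    print(multiscale_crops(img).shape)  # (3, 8, 8)
```

In this toy setup, a target at the center keeps full resolution in the smallest crop, while flankers farther out appear only in the coarser crops. That is one intuition for why target position relative to the center could matter, though the memo itself should be consulted for the actual architecture.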

Date issued

2017-06-26

URI

http://hdl.handle.net/1721.1/110348

Publisher

Center for Brains, Minds and Machines (CBMM), arXiv

Citation

arXiv:1706.08616

Series/Report no.

CBMM Memo Series;069

Keywords

Deep Neural Networks, deep convolutional neural networks, DCNN, eccentricity-dependent

Collections

CBMM Memo Series