A Detailed Look at Scale and Translation Invariance in a Hierarchical Neural Model of Visual Object Recognition

Schneider, Robert; Riesenhuber, Maximilian

Download

AIM-2002-011.ps (2.038Mb)

Additional downloads

AIM-2002-011.pdf (1.013Mb)

Abstract

The HMAX model was recently proposed by Riesenhuber & Poggio as a hierarchical model of position- and size-invariant object recognition in visual cortex. It has also successfully modeled a number of other properties of the ventral visual stream (the visual pathway thought to be crucial for object recognition in cortex), particularly those of view-tuned neurons in macaque inferotemporal cortex, the brain area at the top of the ventral stream. The original modeling study used only "paperclip" stimuli, as in the corresponding physiology experiment, and did not systematically explore how model units' invariance properties depend on model parameters. In this study, we aimed at a deeper understanding of the inner workings of HMAX and of its performance for various parameter settings and "natural" stimulus classes. We systematically examined HMAX responses for different stimulus sizes and positions and found a dependence of model units' responses on stimulus position, for which we offer a quantitative description. Interestingly, we find that the scale invariance properties of hierarchical neural models depend on stimulus class, whereas translation invariance does not, even though both are affine transformations within the image plane.

Date issued

2002-08-01

URI

http://hdl.handle.net/1721.1/7178

Other identifiers

AIM-2002-011

CBCL-218

Series/Report no.

AIM-2002-011; CBCL-218

DSpace@MIT