DSpace@MIT

A Benchmark of Computational Models of Saliency to Predict Human Fixations

Research and Teaching Output of the MIT Community


dc.contributor.advisor Frédo Durand
dc.contributor.author Judd, Tilke en_US
dc.contributor.author Durand, Frédo en_US
dc.contributor.author Torralba, Antonio en_US
dc.contributor.other Computer Graphics en
dc.date.accessioned 2012-01-13T22:30:12Z
dc.date.available 2012-01-13T22:30:12Z
dc.date.issued 2012-01-13
dc.identifier.uri http://hdl.handle.net/1721.1/68590
dc.description.abstract Many computational models of visual attention, built from a wide variety of approaches, have been created to predict where people look in images. Each model is usually introduced by demonstrating its performance on new images, which makes direct comparisons between models difficult. To alleviate this problem, we propose a benchmark data set containing 300 natural images with eye tracking data from 39 observers to compare model performances. We calculate the performance of 10 models at predicting ground-truth fixations using three different metrics. We provide a way for people to submit new models for evaluation online. We find that the Judd et al. and Graph-Based Visual Saliency models perform best. In general, models with blurrier maps and models that include a center bias perform well. We add and optimize a blur and a center bias for each model and show improvements. We compare performances to baseline models of chance, center, and human performance. We show that human performance increases with the number of humans, up to a limit. We analyze the similarity of different models using multidimensional scaling and explore the relationship between model performance and fixation consistency. Finally, we offer observations about how to improve saliency models in the future. en_US
dc.format.extent 22 p. en_US
dc.relation.ispartofseries MIT-CSAIL-TR-2012-001
dc.rights Creative Commons Attribution 3.0 Unported en
dc.rights.uri http://creativecommons.org/licenses/by/3.0/
dc.subject fixation maps, saliency maps, vision en_US
dc.title A Benchmark of Computational Models of Saliency to Predict Human Fixations en_US
dc.language.rfc3066 en-US
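
The evaluation pipeline the abstract describes, scoring each model's saliency map against ground-truth fixations and then adding an optimized center bias, can be sketched roughly as follows. This is a hedged illustration, not the report's actual code: the particular AUC variant, the Gaussian center prior, and the `weight` and `sigma` parameters are all assumptions made for the sketch.

```python
import numpy as np

def auc_judd(saliency, fixations):
    """Score a saliency map against fixation points with an ROC-AUC metric
    (one plausible AUC variant; the benchmark's exact definition is assumed).
    Fixated pixels are positives; every pixel acts as a potential negative."""
    s = saliency.astype(float)
    s = (s - s.min()) / (s.max() - s.min() + 1e-12)   # normalize to [0, 1]
    fix_vals = np.sort(s[fixations[:, 0], fixations[:, 1]])[::-1]
    n_fix, n_pix = len(fix_vals), s.size
    tpr, fpr = [0.0], [0.0]
    # Sweep the threshold over each fixation's saliency value, highest first.
    for i, t in enumerate(fix_vals):
        tpr.append((i + 1) / n_fix)                   # fixations at/above t
        fpr.append(float((s >= t).sum()) / n_pix)     # pixels at/above t
    tpr.append(1.0)
    fpr.append(1.0)
    tpr, fpr = np.array(tpr), np.array(fpr)
    # Area under the ROC curve via the trapezoid rule.
    return float(np.sum(np.diff(fpr) * (tpr[1:] + tpr[:-1]) / 2))

def add_center_bias(saliency, weight=0.5):
    """Blend a saliency map with a Gaussian center prior, mimicking the
    center bias the abstract reports as helpful (weight and sigma are
    illustrative choices, not the report's optimized values)."""
    h, w = saliency.shape
    yy, xx = np.mgrid[0:h, 0:w]
    cy, cx = (h - 1) / 2.0, (w - 1) / 2.0
    sigma = min(h, w) / 3.0
    center = np.exp(-((yy - cy) ** 2 + (xx - cx) ** 2) / (2 * sigma ** 2))
    return (1 - weight) * saliency + weight * center
```

A map whose peaks coincide with the fixations scores near 1.0 under this metric, while a flat or random map scores near 0.5; in the benchmark's setting, the center-bias blend would be tuned per model.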


Files in this item

Name Size Format Description
MIT-CSAIL-TR-2012 ... 48.23Mb PDF
supplementalMater ... 8.317Mb PDF




Creative Commons Attribution 3.0 Unported Except where otherwise noted, this item's license is described as Creative Commons Attribution 3.0 Unported