What Do Different Evaluation Metrics Tell Us About Saliency Models?
Author(s)
Bylinskii, Zoya; Judd, Tilke; Oliva, Aude; Torralba, Antonio; Durand, Frederic
Abstract
How best to evaluate a saliency model's ability to predict where humans look in images is an open research question. The choice of evaluation metric depends on how saliency is defined and how the ground truth is represented. Metrics differ in how they rank saliency models; these differences stem from how false positives and false negatives are treated, whether viewing biases are accounted for, whether spatial deviations are factored in, and how the saliency maps are pre-processed. In this paper, we provide an analysis of 8 different evaluation metrics and their properties. With the help of systematic experiments and visualizations of metric computations, we add interpretability to saliency scores and more transparency to the evaluation of saliency models. Building on the differences in metric properties and behaviors, we make recommendations for metric selection under specific assumptions and for specific applications.
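As a concrete illustration of one of the metrics analyzed in the paper, below is a minimal sketch of Normalized Scanpath Saliency (NSS): the saliency map is standardized to zero mean and unit standard deviation, and the score is the mean standardized value at the human fixation locations. The function name, array conventions, and the toy usage example are illustrative assumptions, not code from the paper.

import numpy as np

def nss(saliency_map: np.ndarray, fixation_map: np.ndarray) -> float:
    """Normalized Scanpath Saliency: mean standardized saliency value
    at fixated pixels (higher is better). Assumes saliency_map is a 2D
    float array and fixation_map is a same-shaped binary fixation mask."""
    # Standardize the saliency map to zero mean and unit variance
    # (assumes the map is not constant, i.e. std() > 0).
    s = (saliency_map - saliency_map.mean()) / saliency_map.std()
    # Average the standardized values at the fixation locations.
    return float(s[fixation_map.astype(bool)].mean())

if __name__ == "__main__":
    # Hypothetical example: a random saliency map and 20 random fixations.
    rng = np.random.default_rng(0)
    sal = rng.random((480, 640))
    fix = np.zeros((480, 640), dtype=bool)
    fix[rng.integers(0, 480, 20), rng.integers(0, 640, 20)] = True
    print(f"NSS: {nss(sal, fix):.3f}")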
Date issued
2019-03
Department
Massachusetts Institute of Technology. Computer Science and Artificial Intelligence Laboratory; Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science
Journal
IEEE Transactions on Pattern Analysis and Machine Intelligence
Publisher
Institute of Electrical and Electronics Engineers (IEEE)
Citation
Bylinskii, Zoya et al. "What Do Different Evaluation Metrics Tell Us About Saliency Models?" IEEE Transactions on Pattern Analysis and Machine Intelligence 41, 3 (March 2019): 740–757. © 2019 IEEE
Version: Original manuscript
ISSN
0162-8828
2160-9292
1939-3539