Interpreting Deep Visual Representations via Network Dissection

Zhou, Bolei; Bau, David; Oliva, Aude; Torralba, Antonio

Terms of use

Creative Commons Attribution-Noncommercial-Share Alike http://creativecommons.org/licenses/by-nc-sa/4.0/

Abstract

The success of recent deep convolutional neural networks (CNNs) depends on learning hidden representations that can summarize the important factors of variation behind the data. In this work, we describe Network Dissection, a method that interprets networks by providing meaningful labels to their individual units. The proposed method quantifies the interpretability of CNN representations by evaluating the alignment between individual hidden units and visual semantic concepts. By identifying the best alignments, units are given interpretable labels ranging from colors, materials, textures, and parts to objects and scenes. The method reveals that deep representations are more transparent and interpretable than they would be under a random, equivalently powerful basis. We apply our approach to interpret and compare the latent representations of several network architectures trained to solve a wide range of supervised and self-supervised tasks. We then examine factors affecting network interpretability, such as the number of training iterations, regularization, different initialization parameters, and network depth and width. Finally, we show that the interpreted units can be used to provide explicit explanations of a given CNN prediction for an image. Our results highlight that interpretability is an important property of deep neural networks that provides new insights into what hierarchical structures can learn.

Keywords: Convolutional neural networks; Network interpretability; Visual recognition; Interpretable machine learning; Visualization; Detectors; Training; Image color analysis; Task analysis; Image segmentation; Semantics
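The abstract does not say how the alignment between a unit and a concept is scored. The sketch below is one plausible illustration, assuming the score is an intersection-over-union (IoU) between a unit's thresholded activation map and a binary concept segmentation mask. The function names `unit_concept_iou` and `best_label`, the `quantile` parameter, and the per-image thresholding are all assumptions made for this example; they are not taken from this record, and the activation map is assumed to be already resized to the mask resolution.

```python
import numpy as np


def unit_concept_iou(activation, concept_mask, quantile=0.995):
    """IoU between a unit's top-activated region and a binary concept mask.

    Assumption: the threshold is a quantile of this single activation map.
    A full pipeline would more likely estimate it over a whole dataset.
    """
    activation = np.asarray(activation, dtype=float)
    concept_mask = np.asarray(concept_mask, dtype=bool)
    threshold = np.quantile(activation, quantile)
    unit_mask = activation > threshold  # region where the unit fires strongly
    intersection = np.logical_and(unit_mask, concept_mask).sum()
    union = np.logical_or(unit_mask, concept_mask).sum()
    return float(intersection) / float(union) if union > 0 else 0.0


def best_label(activation, concept_masks, quantile=0.995):
    """Return the (concept name, IoU) pair with the highest alignment."""
    scores = {
        name: unit_concept_iou(activation, mask, quantile)
        for name, mask in concept_masks.items()
    }
    label = max(scores, key=scores.get)
    return label, scores[label]


# Toy data (hypothetical): a 4x4 unit that fires only in the top-left corner.
activation = np.zeros((4, 4))
activation[:2, :2] = 1.0

sky = np.zeros((4, 4), dtype=bool)
sky[:2, :2] = True    # concept occupying the top-left corner
grass = np.zeros((4, 4), dtype=bool)
grass[2:, :] = True   # concept occupying the bottom half

label, score = best_label(activation, {"sky": sky, "grass": grass}, quantile=0.5)
```

Under these assumptions, a unit would be labeled with whichever concept gives the highest score, and the score itself serves as a rough measure of how interpretable that unit is.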

Date issued

2019-09-01

URI

https://hdl.handle.net/1721.1/122817

Department

Massachusetts Institute of Technology. Computer Science and Artificial Intelligence Laboratory

Journal

IEEE Transactions on Pattern Analysis and Machine Intelligence

Publisher

Institute of Electrical and Electronics Engineers

Citation

Zhou, Bolei et al. "Interpreting Deep Visual Representations via Network Dissection." IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 41, no. 9 (September 2019): pp. 2131-2145 © Institute of Electrical and Electronics Engineers 2019

Version: Author's final manuscript

ISSN

0162-8828

2160-9292

1939-3539

Keywords

Computational Theory and Mathematics, Software, Applied Mathematics, Artificial Intelligence, Computer Vision and Pattern Recognition

Collections

MIT Open Access Articles