dc.contributor.author  Bau, David
dc.contributor.author  Zhou, Bolei
dc.contributor.author  Khosla, Aditya
dc.contributor.author  Oliva, Aude
dc.contributor.author  Torralba, Antonio
dc.date.accessioned  2020-05-01T19:35:43Z
dc.date.available  2020-05-01T19:35:43Z
dc.date.issued  2017-07
dc.identifier.isbn  9781538604571
dc.identifier.uri  https://hdl.handle.net/1721.1/124985
dc.description.abstract  We propose a general framework called Network Dissection for quantifying the interpretability of latent representations of CNNs by evaluating the alignment between individual hidden units and a set of semantic concepts. Given any CNN model, the proposed method draws on a broad data set of visual concepts to score the semantics of hidden units at each intermediate convolutional layer. The units with semantics are given labels across a range of objects, parts, scenes, textures, materials, and colors. We use the proposed method to test the hypothesis that interpretability of units is equivalent to random linear combinations of units, then we apply our method to compare the latent representations of various networks when trained to solve different supervised and self-supervised training tasks. We further analyze the effect of training iterations, compare networks trained with different initializations, examine the impact of network depth and width, and measure the effect of dropout and batch normalization on the interpretability of deep visual representations. We demonstrate that the proposed method can shed light on characteristics of CNN models and training methods that go beyond measurements of their discriminative power. ©2017 Paper presented at the 30th IEEE Conference on Computer Vision and Pattern Recognition (CVPR 2017), July 21-26, 2017, Honolulu, Hawaii.  en_US
dc.language.iso  en
dc.publisher  IEEE  en_US
dc.relation.isversionof  10.1109/cvpr.2017.354  en_US
dc.rights  Creative Commons Attribution-NonCommercial-ShareAlike  en_US
dc.rights.uri  http://creativecommons.org/licenses/by-nc-sa/4.0/  en_US
dc.source  MIT web domain  en_US
dc.title  Network dissection: quantifying interpretability of deep visual representations  en_US
dc.type  Article  en_US
dc.identifier.citation  Bau, David, Bolei Zhou, Aditya Khosla, Aude Oliva, and Antonio Torralba, "Network dissection: quantifying interpretability of deep visual representations." Proceedings, 30th IEEE Conference on Computer Vision and Pattern Recognition (CVPR 2017) (Piscataway, N.J.: IEEE, 2017): pp. 3319-3327. doi: 10.1109/cvpr.2017.354 ©2017 Author(s)  en_US
dc.contributor.department  Massachusetts Institute of Technology. Computer Science and Artificial Intelligence Laboratory  en_US
dc.relation.journal  Proceedings, 30th IEEE Conference on Computer Vision and Pattern Recognition (CVPR 2017)  en_US
dc.eprint.version  Author's final manuscript  en_US
dc.type.uri  http://purl.org/eprint/type/ConferencePaper  en_US
eprint.status  http://purl.org/eprint/status/NonPeerReviewed  en_US
dc.date.updated  2019-07-11T16:40:34Z
dspace.date.submission  2019-07-11T16:40:36Z
mit.metadata.status  Complete
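
A note for readers of the abstract above: the unit-concept alignment it describes is scored in the paper as an intersection-over-union (IoU) between a unit's thresholded activation map and a concept's segmentation mask, computed over a broad probe dataset (Broden). The NumPy sketch below illustrates that scoring step for a single (unit, concept) pair; the function name and array layout are illustrative assumptions, while the 0.5% activation quantile and the 0.04 detector cutoff follow the paper.

import numpy as np

def unit_concept_iou(activations, concept_masks, quantile=0.995, detector_iou=0.04):
    """Network Dissection-style score for one (unit, concept) pair.

    activations:   float array (N, H, W), the unit's upsampled activation
                   maps over N probe images (array layout is an assumption).
    concept_masks: bool array (N, H, W), ground-truth segmentation masks
                   for one concept on the same images.
    """
    # Per-unit threshold T_k chosen so that P(a_k > T_k) = 0.005
    # over all spatial locations in the dataset, as in the paper.
    t_k = np.quantile(activations, quantile)
    unit_mask = activations > t_k
    intersection = np.logical_and(unit_mask, concept_masks).sum()
    union = np.logical_or(unit_mask, concept_masks).sum()
    iou = intersection / union if union > 0 else 0.0
    # The paper counts the unit as a detector for the concept if IoU > 0.04.
    return iou, iou > detector_iou

In the full method this score is computed for every unit against every labeled concept, each unit is assigned its top-scoring concept, and the number of units that confidently match a concept serves as the interpretability measure the abstract refers to.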

