Leveraging dataset examples for the interpretation of black-box deep learning models

Author(s)

Kherraz, Houssam.

Download: 1192561536-MIT.pdf (32.87Mb)

Other Contributors

Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science.

Advisor

Arvind Satyanarayan.

Terms of use

MIT theses may be protected by copyright. Please reuse MIT thesis content according to the MIT Libraries Permissions Policy, which is available through the URL provided. http://dspace.mit.edu/handle/1721.1/7582

Abstract

With growing concerns over how machine learning models behave in deployment, people in academia and industry are more interested than ever in gaining insight into the inner workings of these black-box models. Yet the current toolbox for understanding neural networks is limited. In this work, I propose a new tool, the Neuron Activation Sorter (NAS), centered on a new paradigm in machine learning interpretability: using dataset examples as the main interaction tool for learning about the model. The NAS operates at two levels of granularity through two modes. The Individual Neuron mode operates at the neuron level, while the Layer Summary mode operates at the layer level. The Layer Summary mode shows, for each neuron of a specific layer, the distribution of classes over activation values as a stacked histogram. The Individual Neuron mode explores that distribution further by displaying all the dataset images in the histogram. Together, they provide intuition about both micro and macro behaviors. I explore how these tools can leverage dataset items both to draw intuitive conclusions about the inner workings of a model and to form hypotheses about potential failures. I give concrete examples of the insights they provide by exploring two neural networks: a basic 5-layer convolutional neural network trained on the Quickdraw dataset and a VGG-16 model trained on ImageNet. Both examples expose a taxonomy of neurons and particular insights that are hard to access through other tools such as feature visualizations or saliency maps.
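
As a rough illustration of the idea the abstract describes, and not the thesis's actual implementation, the sketch below ranks dataset examples by one neuron's activation and counts examples per class in each activation bin, which is the data a stacked histogram would display. The function names, the sample activations, and the class labels are all hypothetical and chosen only for this example.

```python
from collections import defaultdict


def sort_by_activation(activations):
    """Return example indices ordered from highest to lowest activation."""
    return sorted(range(len(activations)),
                  key=lambda i: activations[i], reverse=True)


def class_histogram(activations, labels, num_bins, lo, hi):
    """Count examples per (activation bin, class) for a single neuron."""
    width = (hi - lo) / num_bins
    counts = [defaultdict(int) for _ in range(num_bins)]
    for a, label in zip(activations, labels):
        # Clamp the bin index so values at or beyond the range edges
        # fall into the first or last bin.
        b = min(max(int((a - lo) / width), 0), num_bins - 1)
        counts[b][label] += 1
    return [dict(c) for c in counts]


# Hypothetical activations of one neuron over six dataset examples.
acts = [0.1, 0.9, 0.4, 0.8, 0.0, 0.5]
labels = ["cat", "dog", "cat", "dog", "cat", "dog"]

order = sort_by_activation(acts)  # example indices, most activating first
hist = class_histogram(acts, labels, num_bins=2, lo=0.0, hi=1.0)
```

In this toy case every low-activation example is a "cat" and every high-activation example is a "dog", the kind of class separation a per-neuron view is meant to surface. A layer-level summary would repeat the same computation for each neuron in the layer.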

Description

Thesis: M. Eng., Massachusetts Institute of Technology, Department of Electrical Engineering and Computer Science, May, 2020

Cataloged from the official PDF of thesis.

Includes bibliographical references (pages 55-57).

Date issued

2020

URI

https://hdl.handle.net/1721.1/127417

Department

Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science

Publisher

Massachusetts Institute of Technology

Keywords

Electrical Engineering and Computer Science.

Collections

Graduate Theses