The anatomy of visual pattern acquisition in deep learning

Shafiullah, Nur Muhammad Mahi.

Author(s)

Shafiullah, Nur Muhammad Mahi.

Download1227511802-MIT.pdf (2.094Mb)

Other Contributors

Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science.

Advisor

Aleksander Ma̧dry.

Terms of use

MIT theses may be protected by copyright. Please reuse MIT thesis content according to the MIT Libraries Permissions Policy, which is available through the URL provided. http://dspace.mit.edu/handle/1721.1/7582

Metadata

Show full item record

Abstract

Conventional wisdom says that convolutional neural networks use their filter hierarchy to learn feature structures. Yet very little work has been done to understand how the deep networks acquire those complicated features. Moreover, this blind spot opens up neural networks to nefarious security threats like data poisoning based backdoor attacks [Gu et al., 2017]. In this thesis, we study the question of how neural networks acquire particular visual patterns and how that behavior evolves with the learning rate. We aim to understand which patterns the neural network learns and how it prioritizes different patterns over one another over the learning process. We use a backdoor attack-based toolkit to analyze the learning process of the network and identify two fundamental properties of patterns that determine their fate: how often they appear in the dataset, and how strongly they correlate with particular labels. We empirically show how such properties determine how fast the network acquires patterns and what weight it puts on them. Finally, we propose a hypothesis that ties the pattern learning with the gradient from the network, and conclude by presenting a couple of experiments to support our claim.

Description

Thesis: M. Eng., Massachusetts Institute of Technology, Department of Electrical Engineering and Computer Science, September, 2020

Cataloged from student-submitted PDF of thesis.

Includes bibliographical references (pages 65-68).

Date issued

2020

URI

https://hdl.handle.net/1721.1/129226

Department

Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science

Publisher

Massachusetts Institute of Technology

Keywords

Electrical Engineering and Computer Science.

Collections

Graduate Theses