Investigating the Role of Biological Constraints in Adversarial Robustness via Modeling and Representational Geometry

Author(s)

Le Thi Nguyet, Hang

Advisor

DiCarlo, James

Chung, SueYeon

Terms of use

In Copyright - Educational Use Permitted. Copyright MIT. http://rightsstatements.org/page/InC-EDU/1.0/

Abstract

Although deep neural networks (DNNs) achieve excellent performance and even outperform humans on various computer vision tasks, their robustness to small perturbations is still far from that of the human visual system. Indeed, adversarial attacks, which are very small worst-case perturbations, can reduce the accuracy of state-of-the-art models to near chance while remaining imperceptible to humans. Because the human visual system is highly tolerant of small input perturbations, Dapello et al. developed VOneNet, a model with a front-end whose architecture is similar to the V1 brain area and a standard DNN architecture as the back-end, and demonstrated that VOneNet has significantly better adversarial robustness than a standard ResNet. In this work, we analyze the internal representations of adversarial examples to dissect how adversarial perturbations alter the geometric structure and encoded information of the representations, and to understand how brain-like components such as representational noise and neural normalization can improve adversarial robustness. First, we show that internal representations of adversarial examples are linearly separated and still encode a significant amount of class information. Second, we demonstrate that representational noise can create an overlap between noise-injected clean and adversarial examples, thereby improving the robustness of the model. Finally, we show that neural normalization, which is based on divisive normalization and lateral inhibition, achieves better adversarial performance than traditional normalization methods such as batch normalization, which is based on standardization.
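
To make the contrast in the abstract's last sentence concrete, the following is a minimal sketch of the two normalization families it names. It is not the thesis's implementation: the function names, the uniform normalization pool, and the parameters `sigma`, `n`, and `eps` are assumptions chosen for illustration, using the commonly cited textbook form of divisive normalization.

```python
import math


def standardize(values, eps=1e-5):
    """Standardization, as in batch normalization (without learned scale/shift).

    `values` is assumed to hold one feature's activations across a batch:
    subtract the mean, then divide by the standard deviation.
    """
    mean = sum(values) / len(values)
    var = sum((v - mean) ** 2 for v in values) / len(values)
    return [(v - mean) / math.sqrt(var + eps) for v in values]


def divisive_normalize(responses, sigma=1.0, n=2.0):
    """Divisive normalization over a pool of units (illustrative form).

    `responses` is assumed to hold non-negative activations of neighboring
    units for a single input. Each unit's driven response r_i**n is divided
    by sigma**n plus the summed activity of the whole pool, so strongly
    active neighbors suppress each other (a simple form of lateral inhibition).
    """
    powered = [r ** n for r in responses]
    pool = sum(powered)
    return [p / (sigma ** n + pool) for p in powered]


if __name__ == "__main__":
    print(standardize([1.0, 2.0, 3.0]))
    print(divisive_normalize([1.0, 1.0]))
    print(divisive_normalize([10.0, 10.0]))
```

In this sketch, standardization recenters and rescales a feature using batch statistics, whereas divisive normalization rescales each unit by the activity of its neighbors for the same input, so the total output stays below 1 however large the inputs become.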

Date issued

2021-06

URI

https://hdl.handle.net/1721.1/139227

Department

Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science

Publisher

Massachusetts Institute of Technology

Collections

Graduate Theses