Adversarial Robustness Guarantees for Random Deep Neural Networks

De Palma, Giacomo; Kiani, Bobak T; Lloyd, Seth

Publisher Policy

Terms of use

Article is made available in accordance with the publisher's policy and may be subject to US copyright law. Please refer to the publisher's site for terms of use.

Abstract

The reliability of deep learning algorithms is fundamentally challenged by the existence of adversarial examples, which are incorrectly classified inputs that are extremely close to a correctly classified input. We explore the properties of adversarial examples for deep neural networks with random weights and biases, and prove that for any p ≥ 1, the ℓ^p distance of any given input from the classification boundary scales as the ℓ^p norm of the input divided by the square root of the dimension of the input. The results are based on the recently proved equivalence between Gaussian processes and deep neural networks in the limit of infinite width of the hidden layers, and are validated with experiments on both random deep neural networks and deep neural networks trained on the MNIST and CIFAR10 datasets. The results constitute a fundamental advance in the theoretical understanding of adversarial examples, and open the way to a thorough theoretical characterization of the relation between network architecture and robustness to adversarial perturbations.
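The scaling stated in the abstract can be probed numerically with a small sketch. The code below is an illustration under stated assumptions and is not the authors' experimental protocol: it uses a bias-free ReLU network, He-style weight variance 2 / fan_in, the case p = 2 only, and a first-order (linearized) estimate |f(x)| / ||∇f(x)||_2 of the distance to the boundary f = 0 in place of an exact adversarial search. All function names are hypothetical.

```python
import math
import random


def random_relu_net(n, width, depth, rng):
    """Sample a bias-free ReLU network R^n -> R with Gaussian weights.

    Weight variance 2 / fan_in is an assumption (He-style scaling).
    Each layer is a list of rows, one row per output unit.
    """
    sizes = [n] + [width] * depth + [1]
    return [
        [[rng.gauss(0.0, math.sqrt(2.0 / fan_in)) for _ in range(fan_in)]
         for _ in range(fan_out)]
        for fan_in, fan_out in zip(sizes[:-1], sizes[1:])
    ]


def output_and_gradient(layers, x):
    """Return f(x) and the gradient of f with respect to x."""
    h, masks = list(x), []
    for W in layers[:-1]:
        pre = [sum(w * v for w, v in zip(row, h)) for row in W]
        masks.append([1.0 if z > 0.0 else 0.0 for z in pre])  # ReLU derivative
        h = [max(z, 0.0) for z in pre]
    out = sum(w * v for w, v in zip(layers[-1][0], h))
    # Backpropagate from the scalar output down to the input.
    grad = list(layers[-1][0])
    for W, mask in zip(reversed(layers[:-1]), reversed(masks)):
        g = [gi * mi for gi, mi in zip(grad, mask)]
        grad = [sum(g[j] * W[j][i] for j in range(len(W)))
                for i in range(len(W[0]))]
    return out, grad


def first_order_l2_distance(layers, x):
    """Linearized l2 distance from x to the boundary f = 0."""
    out, grad = output_and_gradient(layers, x)
    grad_norm = math.sqrt(sum(g * g for g in grad))
    return abs(out) / grad_norm if grad_norm > 0.0 else math.inf


def mean_scaled_distance(n, width=32, depth=2, trials=20, seed=0):
    """Average of distance * sqrt(n) / ||x||_2 over random networks and inputs."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(trials):
        layers = random_relu_net(n, width, depth, rng)
        x = [rng.gauss(0.0, 1.0) for _ in range(n)]
        x_norm = math.sqrt(sum(v * v for v in x))
        total += first_order_l2_distance(layers, x) * math.sqrt(n) / x_norm
    return total / trials


if __name__ == "__main__":
    for n in (16, 64, 256):
        print(n, mean_scaled_distance(n))
```

If the estimated distance scales as the input norm divided by the square root of the dimension, the printed ratio should stay of order one as n grows rather than drifting with n. Because a bias-free ReLU network is positively homogeneous, the estimate here scales exactly linearly with the norm of the input; the dependence on the dimension is the part the sketch actually probes.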

Date issued

2021

URI

https://hdl.handle.net/1721.1/138868.2

Department

Massachusetts Institute of Technology. Department of Mechanical Engineering; Massachusetts Institute of Technology. Research Laboratory of Electronics; Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science

Journal

International Conference on Machine Learning, vol. 139

Citation

De Palma, Giacomo, Kiani, Bobak T and Lloyd, Seth. 2021. "Adversarial Robustness Guarantees for Random Deep Neural Networks." International Conference on Machine Learning, vol. 139.

Version: Final published version

Collections

MIT Open Access Articles

Version	Item	Date	Summary
2	1721.1/138868.2*	2022-01-10T20:29:37Z	Authority information verified/added.
1	1721.1/138868	2022-01-10T19:45:56Z
