Adversarial examples are not bugs, they are features

Ilyas, A; Santurkar, S; Tsipras, D; Engstrom, L; Tran, B; Madry, A

dc.contributor.author	Ilyas, A
dc.contributor.author	Santurkar, S
dc.contributor.author	Tsipras, D
dc.contributor.author	Engstrom, L
dc.contributor.author	Tran, B
dc.contributor.author	Madry, A
dc.date.accessioned	2021-11-05T15:00:20Z
dc.date.available	2021-11-05T15:00:20Z
dc.date.issued	2019
dc.identifier.uri	https://hdl.handle.net/1721.1/137500
dc.description.abstract	© 2019 Neural information processing systems foundation. All rights reserved. Adversarial examples have attracted significant attention in machine learning, but the reasons for their existence and pervasiveness remain unclear. We demonstrate that adversarial examples can be directly attributed to the presence of non-robust features: features (derived from patterns in the data distribution) that are highly predictive, yet brittle and (thus) incomprehensible to humans. After capturing these features within a theoretical framework, we establish their widespread existence in standard datasets. Finally, we present a simple setting where we can rigorously tie the phenomena we observe in practice to a misalignment between the (human-specified) notion of robustness and the inherent geometry of the data.	en_US
dc.language.iso	en
dc.relation.isversionof	https://papers.nips.cc/paper/2019/hash/e2c420d928d4bf8ce0ff2ec19b371514-Abstract.html	en_US
dc.rights	Article is made available in accordance with the publisher's policy and may be subject to US copyright law. Please refer to the publisher's site for terms of use.	en_US
dc.source	Neural Information Processing Systems (NIPS)	en_US
dc.title	Adversarial examples are not bugs, they are features	en_US
dc.type	Article	en_US
dc.identifier.citation	Ilyas, A, Santurkar, S, Tsipras, D, Engstrom, L, Tran, B et al. 2019. "Adversarial examples are not bugs, they are features." Advances in Neural Information Processing Systems, 32.
dc.contributor.department	Massachusetts Institute of Technology. Computer Science and Artificial Intelligence Laboratory
dc.contributor.department	Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science
dc.relation.journal	Advances in Neural Information Processing Systems	en_US
dc.eprint.version	Final published version	en_US
dc.type.uri	http://purl.org/eprint/type/ConferencePaper	en_US
eprint.status	http://purl.org/eprint/status/NonPeerReviewed	en_US
dc.date.updated	2021-02-02T14:05:36Z
dspace.orderedauthors	Ilyas, A; Santurkar, S; Tsipras, D; Engstrom, L; Tran, B; Madry, A	en_US
dspace.date.submission	2021-02-02T14:05:43Z
mit.journal.volume	32	en_US
mit.license	PUBLISHER_POLICY
mit.metadata.status	Authority Work and Publication Information Needed	en_US

Files in this item

Name: NeurIPS-2019-adversarial-examp ...
Size: 1.494Mb
Format: PDF
Description: Published version

This item appears in the following Collection(s)

MIT Open Access Articles
