Adversarial Examples in Simpler Settings
Author(s)
Wang, Tony T.
Advisor
Wornell, Gregory W.
Abstract
In this thesis we explore adversarial examples for simple model families and simple data distributions, focusing in particular on linear and kernel classifiers. On the theoretical front, we find evidence that natural accuracy and robust accuracy are more likely than not to be misaligned. We conclude that to learn a robust classifier, one should aim for robustness explicitly, either through a good choice of model family or by optimizing directly for robust accuracy. On the empirical front, we discover that kernel classifiers and neural networks are non-robust in similar ways. This suggests that a better understanding of kernel classifier robustness may help unravel some of the mysteries of adversarial examples.
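As a concrete illustration of the setting the abstract describes (not taken from the thesis itself), the sketch below constructs the worst-case l-infinity adversarial perturbation for a linear classifier f(x) = sign(w·x + b), assuming labels y in {-1, +1} and a perturbation budget eps; the function name linear_adversarial_example and the toy data are hypothetical.

# Minimal sketch: for a linear classifier, the margin y * (w.(x + delta) + b)
# is minimized over ||delta||_inf <= eps by delta = -eps * y * sign(w).
import numpy as np

def linear_adversarial_example(w, b, x, y, eps):
    # Return x plus the worst-case l_inf perturbation for true label y in {-1, +1}.
    delta = -eps * y * np.sign(w)   # pushes the margin down as far as possible
    return x + delta

# Hypothetical toy usage.
rng = np.random.default_rng(0)
w, b = rng.normal(size=5), 0.0
x = rng.normal(size=5)
y = 1.0 if w @ x + b >= 0 else -1.0
x_adv = linear_adversarial_example(w, b, x, y, eps=0.5)
print("clean margin:", y * (w @ x + b))
print("adversarial margin:", y * (w @ x_adv + b))

If the adversarial margin goes negative, the perturbed point is misclassified: robust accuracy counts a point as correct only when no perturbation within the eps-ball flips the prediction, whereas natural accuracy checks only the unperturbed point, which is one way the two notions can come apart.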
Date issued
2021-06
Department
Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science
Publisher
Massachusetts Institute of Technology