Center for Brains, Minds & Machines
https://hdl.handle.net/1721.1/88529
2021-02-27T04:45:33ZThe Effects of Image Distribution and Task on Adversarial Robustness
https://hdl.handle.net/1721.1/129813
The Effects of Image Distribution and Task on Adversarial Robustness
Kunhardt, Owen; Deza, Arturo; Poggio, Tomaso
In this paper, we propose an adaptation to the area under the curve (AUC) metric to measure the adversarial robustness of a model over a particular ε-interval [ε_0, ε_1] (interval of adversarial perturbation strengths) that facilitates unbiased comparisons across models when they have different initial ε_0 performance. This can be used to determine how adversarially robust a model is to different image distributions or task (or some other variable); and/or to measure how robust a model is comparatively to other models. We used this adversarial robustness metric on models of an MNIST, CIFAR-10, and a Fusion dataset (CIFAR-10 + MNIST) where trained models performed either a digit or object recognition task using a LeNet, ResNet50, or a fully connected network (FullyConnectedNet) architecture and found the following: 1) CIFAR-10 models are inherently less adversarially robust than MNIST models; 2) Both the image distribution and task that a model is trained on can affect the adversarial robustness of the resultant model. 3) Pretraining with a different image distribution and task sometimes carries over the adversarial robustness induced by that image distribution and task in the resultant model; Collectively, our results imply non-trivial differences of the learned representation space of one perceptual system over another given its exposure to different image statistics or tasks (mainly objects vs digits). Moreover, these results hold even when model systems are equalized to have the same level of performance, or when exposed to approximately matched image statistics of fusion images but with different tasks.
2021-02-18T00:00:00ZCross-validation Stability of Deep Networks
https://hdl.handle.net/1721.1/129744
Cross-validation Stability of Deep Networks
Banburski, Andrzej; De La Torre, Fernanda; Plant, Nishka; Shastri, Ishana; Poggio, Tomaso
Recent theoretical results show that gradient descent on deep neural networks under exponential loss functions locally maximizes classification margin, which is equivalent to minimizing the norm of the weight matrices under margin constraints. This property of the solution however does not fully characterize the generalization performance. We motivate theoretically and show empirically that the area under the curve of the margin distribution on the training set is in fact a good measure of generalization. We then show that, after data separation is achieved, it is possible to dynamically reduce the training set by more than 99% without significant loss of performance. Interestingly, the resulting subset of “high capacity” features is not consistent across different training runs, which is consistent with the theoretical claim that all training points should converge to the same asymptotic margin under SGD and in the presence of both batch normalization and weight decay.
2021-02-09T00:00:00ZFrom Associative Memories to Deep Networks
https://hdl.handle.net/1721.1/129402
From Associative Memories to Deep Networks
Poggio, Tomaso
About fifty years ago, holography was proposed as a model of associative memory. Associative memories with similar properties were soon after implemented as simple networks of threshold neurons by Willshaw and Longuet-Higgins. In these pages I will show that today’s deep nets are an incremental improvement of the original associative networks. Thinking about deep learning in terms of associative networks provides a more realistic and sober perspective on the promises of deep learning and on its role in eventually understanding human intelligence. As a bonus, this discussion also uncovers connections with several interesting topics in applied math: random features, random projections, neural ensembles, randomized kernels, memory and generalization, vector quantization and hierarchical vector quantization, random vectors and orthogonal basis, NTK and radial kernels.
2021-01-12T00:00:00ZDreaming with ARC
https://hdl.handle.net/1721.1/128607
Dreaming with ARC
Banburski, Andrzej; Ghandi, Anshula; Alford, Simon; Dandekar, Sylee; Chin, Peter; Poggio, Tomaso
Current machine learning algorithms are highly specialized to whatever it is they are meant to do –– e.g. playing chess, picking up objects, or object recognition. How can we extend this to a system that could solve a wide range of problems? We argue that this can be achieved by a modular system –– one that can adapt to solving different problems by changing only the modules chosen and the order in which those modules are applied to the problem. The recently introduced ARC (Abstraction and Reasoning Corpus) dataset serves as an excellent test of abstract reasoning. Suited to the modular approach, the tasks depend on a set of human Core Knowledge inbuilt priors. In this paper we implement these priors as the modules of our system. We combine these modules using a neural-guided program synthesis.
2020-11-23T00:00:00Z