Compositional Sparsity: a framework for ML
Author(s): Poggio, Tomaso
Abstract
The main claim of this perspective is that compositional sparsity of the target function, which corresponds to the task to be learned, is the key principle underlying machine learning. I prove that, under restrictions of smoothness on the constituent functions, sparsity of the compositional target function naturally leads to sparse deep networks for approximation, optimization and generalization. This is the case for most CNNs in current use, in which the known sparse graph of the target function is reflected in the sparse connectivity of the network. When the graph of the target function is unknown, I conjecture that transformers are able to implement a flexible version of sparsity (selecting which input tokens interact in the MLP layer) through the self-attention layers.
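To make the sparse-graph picture concrete, here is a minimal illustrative sketch (mine, not from the memo): a compositionally sparse target on d = 4 variables, built from constituent functions that each depend on only two inputs, together with an untrained deep network whose connectivity mirrors the same binary-tree graph. The functions h1, h2, h3, the block widths, and the initialization are hypothetical choices, not the memo's construction.

```python
import numpy as np

rng = np.random.default_rng(0)

# Constituent functions: smooth, and each depends on only two variables.
h1 = lambda a, b: np.tanh(a + 2.0 * b)
h2 = lambda a, b: np.sin(a * b)
h3 = lambda a, b: a ** 2 - b

def target(x):
    # Compositionally sparse target: f(x1..x4) = h3(h1(x1, x2), h2(x3, x4)).
    return h3(h1(x[0], x[1]), h2(x[2], x[3]))

def init_block(width=8, fan_in=2):
    # One small dense block per constituent function (hypothetical sizes).
    return (0.5 * rng.normal(size=(width, fan_in)), np.zeros(width),
            0.5 * rng.normal(size=(1, width)), np.zeros(1))

def block(params, u):
    W1, b1, W2, b2 = params
    return W2 @ np.tanh(W1 @ u + b1) + b2

blocks = {name: init_block() for name in ("h1", "h2", "h3")}

def sparse_net(x):
    # Sparse connectivity mirrors the graph of the target function:
    # each block sees only the inputs its constituent function depends on.
    u1 = block(blocks["h1"], x[0:2])   # sees only x1, x2
    u2 = block(blocks["h2"], x[2:4])   # sees only x3, x4
    return block(blocks["h3"], np.concatenate([u1, u2]))

x = rng.normal(size=4)
print("target f(x):", target(x))
print("untrained sparse network output:", sparse_net(x))
```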
Surprisingly, the assumption of compositional sparsity of the target function is not restrictive in practice, since for computable functions with Lipschitz continuous derivatives compositional sparsity is equivalent to efficient computability, that is, computability in polynomial time.
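Schematically (the notation G, g_v, and the arity bound k below are mine, not the memo's), the equivalence says that such a function admits a representation of the following form, and conversely that any function with such a representation is computable in polynomial time:

```latex
% Schematic statement only; notation is assumed, not quoted from the memo.
\[
  f(x_1,\dots,x_d) \;=\; g_{v_{\mathrm{out}}}\!\bigl(g_{u_1}(\,\cdot\,),\, g_{u_2}(\,\cdot\,)\bigr),
  \qquad \text{computed on a DAG } G \text{ with } |V(G)| = \mathrm{poly}(d),
\]
\[
  \text{where each node } v \in V(G) \text{ evaluates a constituent function } g_v
  \text{ of at most } k = O(1) \text{ inputs, with Lipschitz continuous derivatives.}
\]
```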
Date issued: 2022-10-10
Publisher: Center for Brains, Minds and Machines (CBMM)
Series/Report no.: CBMM Memo; 138