Compositional Sparsity: a framework for ML
Author(s): Poggio, Tomaso
Abstract
The main claim of this perspective is that compositional sparsity of the target function, which corresponds to the task to be learned, is the key principle underlying machine learning. I prove that, under restrictions of smoothness on the constituent functions, sparsity of the compositional target function naturally leads to sparse deep networks for approximation, optimization and generalization. This is the case for most CNNs in current use, in which the known sparse graph of the target function is reflected in the sparse connectivity of the network. When the graph of the target function is unknown, I conjecture that transformers are able to implement a flexible version of sparsity (selecting which input tokens interact in the MLP layer) through the self-attention layers.
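To make the sparse-graph picture concrete, here is a minimal illustrative sketch (mine, not from the memo): a compositionally sparse target on d = 4 variables, built from constituent functions that each depend on only two inputs, together with an untrained deep network whose connectivity mirrors the same binary-tree graph. The functions h1, h2, h3, the block widths, and the initialization are hypothetical choices, not the memo's construction.

```python
import numpy as np

rng = np.random.default_rng(0)

# Constituent functions: smooth, and each depends on only two variables.
h1 = lambda a, b: np.tanh(a + 2.0 * b)
h2 = lambda a, b: np.sin(a * b)
h3 = lambda a, b: a ** 2 - b

def target(x):
    # Compositionally sparse target: f(x1..x4) = h3(h1(x1, x2), h2(x3, x4)).
    return h3(h1(x[0], x[1]), h2(x[2], x[3]))

def init_block(width=8, fan_in=2):
    # One small dense block per constituent function (hypothetical sizes).
    return (0.5 * rng.normal(size=(width, fan_in)), np.zeros(width),
            0.5 * rng.normal(size=(1, width)), np.zeros(1))

def block(params, u):
    W1, b1, W2, b2 = params
    return W2 @ np.tanh(W1 @ u + b1) + b2

blocks = {name: init_block() for name in ("h1", "h2", "h3")}

def sparse_net(x):
    # Sparse connectivity mirrors the graph of the target function:
    # each block sees only the inputs its constituent function depends on.
    u1 = block(blocks["h1"], x[0:2])   # sees only x1, x2
    u2 = block(blocks["h2"], x[2:4])   # sees only x3, x4
    return block(blocks["h3"], np.concatenate([u1, u2]))

x = rng.normal(size=4)
print("target f(x):", target(x))
print("untrained sparse network output:", sparse_net(x))
```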
Surprisingly, the assumption of compositional sparsity of the target function is not restrictive in practice, since for computable functions with Lipschitz continuous derivatives compositional sparsity is equivalent to efficient computability, that is, computability in polynomial time.
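Schematically (the notation G, g_v, and the arity bound k below are mine, not the memo's), the equivalence says that such a function admits a representation of the following form, and conversely that any function with such a representation is computable in polynomial time:

```latex
% Schematic statement only; notation is assumed, not quoted from the memo.
\[
  f(x_1,\dots,x_d) \;=\; g_{v_{\mathrm{out}}}\!\bigl(g_{u_1}(\,\cdot\,),\, g_{u_2}(\,\cdot\,)\bigr),
  \qquad \text{computed on a DAG } G \text{ with } |V(G)| = \mathrm{poly}(d),
\]
\[
  \text{where each node } v \in V(G) \text{ evaluates a constituent function } g_v
  \text{ of at most } k = O(1) \text{ inputs, with Lipschitz continuous derivatives.}
\]
```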
Date issued: 2022-10-10
Publisher: Center for Brains, Minds and Machines (CBMM)
Series/Report no.: CBMM Memo; 138