A Homogeneous Transformer Architecture
Author(s)
Gan, Yulu; Poggio, Tomaso
Download: CBMM-Memo-143.pdf (1.067 MB)
Abstract
While the Transformer architecture has had a substantial impact on machine learning, it remains unclear what purpose each component serves in the overall architecture. Heterogeneous nonlinear circuits, such as multi-layer ReLU networks, are interleaved with layers of softmax units. We introduce here a homogeneous architecture based on Hyper Radial Basis Function (HyperBF) units. Evaluations on CIFAR-10, CIFAR-100, and Tiny ImageNet demonstrate performance comparable to standard vision transformers.
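For readers unfamiliar with HyperBF units, the following is a minimal sketch of one such layer, assuming the classical Poggio–Girosi formulation: each unit applies a Gaussian to a learned weighted distance between the input and a center, and the layer output is a linear combination of the unit activations. The function and variable names here are illustrative, not taken from the memo's implementation.

```python
import numpy as np

def hyperbf_layer(x, centers, W, weights):
    """Sketch of a Hyper Radial Basis Function (HyperBF) layer.

    Each unit i computes exp(-||W (x - c_i)||^2), a Gaussian of a
    learned weighted distance to its center c_i; the layer output
    is a linear combination of these radial activations.
    """
    diffs = x[None, :] - centers            # (n_centers, dim)
    d2 = np.sum((diffs @ W.T) ** 2, axis=1) # weighted squared distances
    phi = np.exp(-d2)                       # radial (Gaussian) activations
    return weights @ phi                    # linear read-out

rng = np.random.default_rng(0)
dim, n_centers, n_out = 4, 3, 2
x = rng.normal(size=dim)
centers = rng.normal(size=(n_centers, dim))
W = np.eye(dim)  # learned metric; identity reduces to a standard RBF
weights = rng.normal(size=(n_out, n_centers))
y = hyperbf_layer(x, centers, W, weights)
print(y.shape)  # (2,)
```

With `W` set to the identity the unit reduces to an ordinary Gaussian RBF; making `W` learnable is what distinguishes the "Hyper" variant.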
Date issued
2023-09-18
Publisher
Center for Brains, Minds and Machines (CBMM)
Series/Report no.
CBMM Memo;143