Publications: Recent submissions
- For HyperBFs AGOP is a greedy approximation to gradient descent
  (Center for Brains, Minds and Machines (CBMM), 2024-07-13) The Average Gradient Outer Product (AGOP) provides a novel approach to feature learning in neural networks. We applied both AGOP and Gradient Descent to learn the matrix M in the Hyper Basis Function Network (HyperBF) and ...
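  For context, the Average Gradient Outer Product named in this abstract is conventionally defined over a dataset as M = (1/n) Σᵢ ∇f(xᵢ) ∇f(xᵢ)ᵀ, the average of rank-one outer products of the predictor's input gradients. A minimal NumPy sketch of that general formula follows; the `grad_fn` interface is an illustrative assumption, and the HyperBF-specific learning procedure is in the paper itself:

  ```python
  import numpy as np

  def agop(grad_fn, X):
      """Average Gradient Outer Product: M = (1/n) * sum_i g_i g_i^T,
      where g_i is the gradient of the predictor f at input x_i.

      grad_fn: assumed callable mapping one input x of shape (d,) to
               the gradient of f at x, also of shape (d,).
      X:       data matrix of shape (n, d).
      """
      n, d = X.shape
      M = np.zeros((d, d))
      for x in X:
          g = grad_fn(x)         # gradient of f at this sample
          M += np.outer(g, g)    # accumulate the rank-one outer product
      return M / n
  ```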
- Compositional Sparsity of Learnable Functions
  (Center for Brains, Minds and Machines (CBMM), 2024-02-08) Neural networks have demonstrated impressive success in various domains, raising the question of what fundamental principles underlie the effectiveness of the best AI systems and quite possibly of human intelligence. This ...
- The Janus effects of SGD vs GD: high noise and low rank
  (2023-12-21) It has long been clear that SGD has higher fluctuations at convergence than GD. It has also often been reported that SGD in deep ReLU networks has a low-rank bias in the weight matrices. A recent theoretical analysis linked ...
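  The low-rank bias this abstract refers to is typically checked empirically by inspecting the singular-value spectrum of trained weight matrices. A hypothetical sketch of such a check, assuming a tolerance-based notion of numerical rank (the function name and threshold are illustrative, not from the paper):

  ```python
  import numpy as np

  def effective_rank(W, tol=1e-3):
      """Number of singular values of W above tol * (largest singular value).
      A strong low-rank bias shows up as an effective rank far below
      min(W.shape)."""
      s = np.linalg.svd(W, compute_uv=False)
      return int(np.sum(s > tol * s[0]))
  ```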