
    • For HyperBFs AGOP is a greedy approximation to gradient descent 

      Gan, Yulu; Poggio, Tomaso (Center for Brains, Minds and Machines (CBMM), 2024-07-13)
      The Average Gradient Outer Product (AGOP) provides a novel approach to feature learning in neural networks. We applied both AGOP and Gradient Descent to learn the matrix M in the Hyper Basis Function Network (HyperBF) and ...
    • Compositional Sparsity of Learnable Functions 

      Poggio, Tomaso; Fraser, Maia (Center for Brains, Minds and Machines (CBMM), 2024-02-08)
      Neural networks have demonstrated impressive success in various domains, raising the question of what fundamental principles underlie the effectiveness of the best AI systems and quite possibly of human intelligence. This ...
    • The Janus effects of SGD vs GD: high noise and low rank 

      Xu, Mengjia; Galanti, Tomer; Rangamani, Akshay; Rosasco, Lorenzo; Poggio, Tomaso (2023-12-21)
      It was always obvious that SGD has higher fluctuations at convergence than GD. It has also been often reported that SGD in deep ReLU networks has a low-rank bias in the weight matrices. A recent theoretical analysis linked ...
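
The AGOP mentioned in the first entry has a standard definition: the outer product of the model's input gradient with itself, averaged over the data, M = (1/n) Σᵢ ∇f(xᵢ)∇f(xᵢ)ᵀ. A minimal sketch of computing it, using a toy model f(x) = tanh(w·x) chosen here purely for illustration (it is not the HyperBF setup from the memo):

```python
import numpy as np

# Toy scalar model f(x) = tanh(w . x); the AGOP is
#   M = (1/n) * sum_i grad f(x_i) grad f(x_i)^T,
# a symmetric positive semidefinite d x d matrix.
rng = np.random.default_rng(0)
d, n = 5, 200
w = rng.normal(size=d)
X = rng.normal(size=(n, d))

def grad_f(x):
    # d/dx tanh(w . x) = (1 - tanh(w . x)^2) * w
    return (1.0 - np.tanh(w @ x) ** 2) * w

grads = np.stack([grad_f(x) for x in X])  # shape (n, d)
M = grads.T @ grads / n                   # the AGOP matrix
```

For this toy model every gradient is a scalar multiple of w, so M is rank one, a small illustration of how gradient outer products can concentrate on a low-dimensional subspace of input directions.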