
dc.contributor.advisor	Fiete, Ila
dc.contributor.author	Boopathy, Akhilan
dc.date.accessioned	2026-01-20T19:45:42Z
dc.date.available	2026-01-20T19:45:42Z
dc.date.issued	2025-09
dc.date.submitted	2025-09-15T14:39:33.059Z
dc.identifier.uri	https://hdl.handle.net/1721.1/164566
dc.description.abstract	Neural networks excel in a wide range of applications due to their ability to generalize beyond training data. However, their performance degrades on high-dimensional tasks without large-scale data, a challenge known as the curse of dimensionality. This thesis addresses this limitation by pursuing three key objectives aimed at understanding and improving neural network generalization.
1. We aim to investigate the scaling laws underlying generalization in neural networks, including double descent, a phenomenon in which test error temporarily rises as a model's capacity or training data increases before continuing to decrease. We pursue two goals: 1) a better understanding of when double descent can and cannot be empirically observed, and 2) a better understanding of scaling laws with respect to training time.
2. Inductive bias refers to the set of assumptions a learning algorithm makes to predict outputs on inputs it has not encountered. We propose quantifying the amount of inductive bias required for a model to generalize well with a fixed amount of training data. By developing methods to measure inductive bias, we can assess how much information model designers need to incorporate into neural networks to improve their generalizability. This quantification can also guide the design of harder tasks that better test a model's generalization.
3. Finally, we aim to develop new methods to enhance neural network generalization, particularly by reducing the exponentially large number of training samples required for high-dimensional tasks. This involves creating algorithms and architectures that learn effectively from limited data by incorporating stronger inductive biases. We focus on two such biases: 1) learning features of the training loss landscape correlated with generalization, and 2) using modular neural network architectures.
We expect these techniques to improve generalization, particularly on high-dimensional tasks. Together, these contributions aim to deepen our theoretical understanding and provide practical tools for enabling neural networks to generalize effectively from limited data.
dc.publisher	Massachusetts Institute of Technology
dc.rights	In Copyright - Educational Use Permitted
dc.rights	Copyright retained by author(s)
dc.rights.uri	https://rightsstatements.org/page/InC-EDU/1.0/
dc.title	Towards High-Dimensional Generalization in Neural Networks
dc.type	Thesis
dc.description.degree	Ph.D.
dc.contributor.department	Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science
mit.thesis.degree	Doctoral
thesis.degree.name	Doctor of Philosophy

