
dc.contributor.advisor	Fiete, Ila
dc.contributor.author	Boopathy, Akhilan
dc.date.accessioned	2026-01-20T19:45:42Z
dc.date.available	2026-01-20T19:45:42Z
dc.date.issued	2025-09
dc.date.submitted	2025-09-15T14:39:33.059Z
dc.identifier.uri	https://hdl.handle.net/1721.1/164566
dc.description.abstract	Neural networks excel in a wide range of applications due to their ability to generalize beyond training data. However, their performance degrades on high-dimensional tasks without large-scale data, a challenge known as the curse of dimensionality. This thesis addresses this limitation by pursuing three key objectives aimed at understanding and improving neural network generalization.
1. We aim to investigate the scaling laws underlying generalization in neural networks, including double descent, a phenomenon in which test error temporarily rises as a model's capacity or training data increases before continuing to decrease. We pursue two goals: 1) a better understanding of when double descent can and cannot be empirically observed, and 2) a better understanding of scaling laws with respect to training time.
2. Inductive bias refers to the set of assumptions a learning algorithm makes to predict outputs on inputs it has not encountered. We propose quantifying the amount of inductive bias required for a model to generalize well with a fixed amount of training data. By developing methods to measure inductive bias, we can assess how much information model designers need to incorporate into neural networks to improve their generalizability. This quantification can also guide the design of harder tasks that better test a model's generalization.
3. Finally, we aim to develop new methods to enhance neural network generalization, particularly by reducing the exponentially large number of training samples required for high-dimensional tasks. This involves creating algorithms and architectures that learn effectively from limited data by incorporating stronger inductive biases. We focus on two such biases: 1) learning features of the training loss landscape correlated with generalization, and 2) using modular neural network architectures.
We expect these techniques to improve generalization, particularly on high-dimensional tasks. Together, these contributions aim to deepen our theoretical understanding and provide practical tools for enabling neural networks to generalize effectively from limited data.
dc.publisher	Massachusetts Institute of Technology
dc.rights	In Copyright - Educational Use Permitted
dc.rights	Copyright retained by author(s)
dc.rights.uri	https://rightsstatements.org/page/InC-EDU/1.0/
dc.title	Towards High-Dimensional Generalization in Neural Networks
dc.type	Thesis
dc.description.degree	Ph.D.
dc.contributor.department	Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science
mit.thesis.degree	Doctoral
thesis.degree.name	Doctor of Philosophy

