Show simple item record

dc.contributor.advisor: Isola, Phillip
dc.contributor.advisor: Agrawal, Pulkit
dc.contributor.author: Huh, Minyoung
dc.date.accessioned: 2025-03-12T16:54:55Z
dc.date.available: 2025-03-12T16:54:55Z
dc.date.issued: 2024-09
dc.date.submitted: 2025-03-04T18:31:12.821Z
dc.identifier.uri: https://hdl.handle.net/1721.1/158482
dc.description.abstract: At the core of human intelligence lies an insatiable drive to uncover the simple underlying principles that govern the world's complexities. This quest for parsimony is not unique to biological cognition; it also appears to be a fundamental characteristic of artificial intelligence systems. In this thesis, we explore the intrinsic simplicity bias exhibited by deep neural networks, the powerhouse of modern AI. By analyzing the effective rank of the learned representation kernels, we show that these models have an inherent preference for learning parsimonious relationships in the data. We provide further experimental results supporting the hypothesis that simplicity bias is a good inductive bias for finding generalizing solutions. Building on this finding, we present the Platonic Representation Hypothesis: the idea that as AI systems continue to grow in capability, they converge toward not only simple representational kernels but also a common one. This phenomenon is evidenced by the increasing similarity of models across domains, suggesting the existence of a Platonic "ideal" way to represent the world. However, the path to the Platonic representation requires scaling up AI models, which poses significant computational challenges. To address this obstacle, we conclude the thesis by proposing a framework for training a model with parallel low-rank updates to reach this convergent endpoint efficiently.
dc.publisher: Massachusetts Institute of Technology
dc.rights: Attribution-NonCommercial-NoDerivatives 4.0 International (CC BY-NC-ND 4.0)
dc.rights: Copyright retained by author(s)
dc.rights.uri: https://creativecommons.org/licenses/by-nc-nd/4.0/
dc.title: Parsimonious Principles of Deep Neural Networks
dc.type: Thesis
dc.description.degree: Ph.D.
dc.contributor.department: Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science
mit.thesis.degree: Doctoral
thesis.degree.name: Doctor of Philosophy
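Note: the abstract's central quantity, the effective rank of a representation kernel, is not defined in this record. Below is a minimal illustrative sketch assuming the common spectral-entropy definition of effective rank (Roy & Vetterli, 2007), which may differ in detail from the thesis's own formulation; the function name and the toy data are hypothetical.

    # Minimal sketch: effective rank of a representation (Gram) kernel.
    # Assumes the spectral-entropy definition exp(H(p)), where p is the
    # normalized eigenvalue spectrum of K = Z Z^T; the record above does
    # not specify which definition the thesis uses.
    import numpy as np

    def effective_rank(features: np.ndarray) -> float:
        """features: (n_samples, dim) matrix Z of learned representations."""
        z = features - features.mean(axis=0, keepdims=True)  # center features
        k = z @ z.T                                          # Gram / kernel matrix
        eig = np.linalg.eigvalsh(k)                          # symmetric -> real eigenvalues
        eig = np.clip(eig, 0.0, None)                        # drop tiny negative noise
        p = eig / eig.sum()                                  # normalized spectrum
        entropy = -np.sum(p * np.log(p + 1e-12))             # spectral (Shannon) entropy
        return float(np.exp(entropy))                        # effective rank in [1, n]

    # Toy example: a low-rank ("simple") representation versus a full-rank one.
    rng = np.random.default_rng(0)
    low_rank = rng.normal(size=(256, 4)) @ rng.normal(size=(4, 512))  # rank <= 4
    full_rank = rng.normal(size=(256, 512))
    print(effective_rank(low_rank))   # at most ~4: variance in few directions
    print(effective_rank(full_rank))  # far larger: spectrum spread widely

Under this definition, a model with a strong simplicity bias would drive the effective rank of its learned kernel well below the ambient dimension, as the low-rank example illustrates.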

