dc.contributor.advisor | Isola, Phillip | |
dc.contributor.advisor | Agrawal, Pulkit | |
dc.contributor.author | Huh, Minyoung | |
dc.date.accessioned | 2025-03-12T16:54:55Z | |
dc.date.available | 2025-03-12T16:54:55Z | |
dc.date.issued | 2024-09 | |
dc.date.submitted | 2025-03-04T18:31:12.821Z | |
dc.identifier.uri | https://hdl.handle.net/1721.1/158482 | |
dc.description.abstract | At the core of human intelligence lies an insatiable drive to uncover the simple underlying principles that govern the world’s complexities. This quest for parsimony is not unique to biological cognition but also appears to be a fundamental characteristic of artificial intelligence systems. In this thesis, we explore the intrinsic simplicity bias exhibited by deep neural networks — the powerhouse of modern AI. By analyzing the effective rank of the learned representation kernels, we show that these models have an inherent preference for learning parsimonious relationships in the data. We provide further experimental evidence supporting the hypothesis that simplicity bias is a useful inductive bias for finding solutions that generalize. Building on this finding, we present the Platonic Representation Hypothesis — the idea that as AI systems continue to grow in capability, they will converge not only toward simple representational kernels but toward a common one. This phenomenon is evidenced by the increasing similarity of models across domains, suggesting the existence of a Platonic “ideal” way to represent the world. Reaching this Platonic representation, however, requires scaling up AI models, which poses significant computational challenges. To address this obstacle, we conclude the thesis by proposing a framework for training a model with parallel low-rank updates to effectively reach this convergent endpoint. | |
dc.publisher | Massachusetts Institute of Technology | |
dc.rights | Attribution-NonCommercial-NoDerivatives 4.0 International (CC BY-NC-ND 4.0) | |
dc.rights | Copyright retained by author(s) | |
dc.rights.uri | https://creativecommons.org/licenses/by-nc-nd/4.0/ | |
dc.title | Parsimonious Principles of Deep Neural Networks | |
dc.type | Thesis | |
dc.description.degree | Ph.D. | |
dc.contributor.department | Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science | |
mit.thesis.degree | Doctoral | |
thesis.degree.name | Doctor of Philosophy | |