dc.contributor.advisor: Fiete, Ila
dc.contributor.author: Boopathy, Akhilan
dc.date.accessioned: 2022-08-29T16:21:38Z
dc.date.available: 2022-08-29T16:21:38Z
dc.date.issued: 2022-05
dc.date.submitted: 2022-06-21T19:25:43.757Z
dc.identifier.uri: https://hdl.handle.net/1721.1/144929
dc.description.abstract: Artificial neural networks have become highly effective at performing specific, challenging tasks by leveraging large amounts of training data. However, they are unable to generalize to diverse, unseen domains without significant retraining. This thesis quantifies the generalization difficulty of a task as the information content of the inductive biases required to solve it, and demonstrates that generalization difficulty depends crucially on the number of dimensions of generalization. Inspired by the modularity of biological learning systems, the thesis then demonstrates, both theoretically and empirically, that modularity promotes generalization by providing a powerful inductive bias. Finally, the thesis proposes a challenging new spatial navigation benchmark that requires a broad degree of generalization from a small amount of training data. This benchmark is presented as a test of the generalization capability of learning algorithms; based on the results of this thesis, modularity is expected to promote generalization on this benchmark.
dc.publisher: Massachusetts Institute of Technology
dc.rights: In Copyright - Educational Use Permitted
dc.rights: Copyright MIT
dc.rights.uri: http://rightsstatements.org/page/InC-EDU/1.0/
dc.title: Towards More Generalizable Neural Networks via Modularity
dc.type: Thesis
dc.description.degree: S.M.
dc.contributor.department: Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science
mit.thesis.degree: Master
thesis.degree.name: Master of Science in Electrical Engineering and Computer Science

