dc.contributor.author: Liu, Ziming
dc.contributor.author: Gan, Eric
dc.contributor.author: Tegmark, Max
dc.date.accessioned: 2024-01-30T17:26:23Z
dc.date.available: 2024-01-30T17:26:23Z
dc.date.issued: 2023-12-30
dc.identifier.uri: https://hdl.handle.net/1721.1/153423
dc.description.abstract: We introduce Brain-Inspired Modular Training (BIMT), a method for making neural networks more modular and interpretable. Inspired by brains, BIMT embeds neurons in a geometric space and augments the loss function with a cost proportional to the length of each neuron connection. This is inspired by the idea of minimum connection cost in evolutionary biology, but we are the first to combine this idea with training neural networks with gradient descent for interpretability. We demonstrate that BIMT discovers useful modular neural networks for many simple tasks, revealing compositional structures in symbolic formulas, interpretable decision boundaries and features for classification, and mathematical structure in algorithmic datasets. Qualitatively, BIMT-trained networks have modules readily identifiable by the naked eye, but regularly trained networks seem much more complicated. Quantitatively, we use Newman’s method to compute the modularity of network graphs; BIMT achieves the highest modularity for all our test problems. A promising and ambitious future direction is to apply the proposed method to understand large models for vision, language, and science. [en_US]
dc.publisher: Multidisciplinary Digital Publishing Institute [en_US]
dc.relation.isversionof: http://dx.doi.org/10.3390/e26010041 [en_US]
dc.rights: Creative Commons Attribution [en_US]
dc.rights.uri: https://creativecommons.org/licenses/by/4.0/ [en_US]
dc.source: Multidisciplinary Digital Publishing Institute [en_US]
dc.title: Seeing Is Believing: Brain-Inspired Modular Training for Mechanistic Interpretability [en_US]
dc.type: Article [en_US]
dc.identifier.citation: Entropy 26 (1): 41 (2024) [en_US]
dc.identifier.mitlicense: PUBLISHER_CC
dc.eprint.version: Final published version [en_US]
dc.type.uri: http://purl.org/eprint/type/JournalArticle [en_US]
eprint.status: http://purl.org/eprint/status/PeerReviewed [en_US]
dc.date.updated: 2024-01-26T14:10:58Z
dspace.date.submission: 2024-01-26T14:10:58Z
mit.license: PUBLISHER_CC
mit.metadata.status: Authority Work and Publication Information Needed [en_US]

