Show simple item record

dc.contributor.author: Goldfeld, Ziv
dc.contributor.author: van den Berg, Ewout
dc.contributor.author: Greenewald, Kristjan
dc.contributor.author: Melnyk, Igor
dc.contributor.author: Nguyen, Nam
dc.contributor.author: Kingsbury, Brian
dc.contributor.author: Polyanskiy, Yury
dc.date.accessioned: 2021-11-05T14:29:09Z
dc.date.available: 2021-11-05T14:29:09Z
dc.date.issued: 2019
dc.identifier.uri: https://hdl.handle.net/1721.1/137481
dc.description.abstract: Copyright © 2019 ASME. We study the estimation of the mutual information I(X; Tℓ) between the input X to a deep neural network (DNN) and the output vector Tℓ of its ℓth hidden layer (an "internal representation"). Focusing on feedforward networks with fixed weights and noisy internal representations, we develop a rigorous framework for accurate estimation of I(X; Tℓ). By relating I(X; Tℓ) to information transmission over additive white Gaussian noise channels, we reveal that compression, i.e., reduction in I(X; Tℓ) over the course of training, is driven by progressive geometric clustering of the representations of samples from the same class. Experimental results verify this connection. Finally, we shift focus to purely deterministic DNNs, where I(X; Tℓ) is provably vacuous, and show that nevertheless, these models also cluster inputs belonging to the same class. The binning-based approximation of I(X; Tℓ) employed in past works to measure compression is identified as a measure of clustering, thus clarifying that these experiments were in fact tracking the same clustering phenomenon. Leveraging the clustering perspective, we provide new evidence that compression and generalization may not be causally related and discuss potential future research ideas. [en_US]
dc.language.iso: en
dc.relation.isversionof: http://proceedings.mlr.press/v97/goldfeld19a.html [en_US]
dc.rights: Article is made available in accordance with the publisher's policy and may be subject to US copyright law. Please refer to the publisher's site for terms of use. [en_US]
dc.source: Proceedings of Machine Learning Research [en_US]
dc.title: Estimating Information Flow in Deep Neural Networks [en_US]
dc.type: Article [en_US]
dc.identifier.citation: Goldfeld, Ziv, van den Berg, Ewout, Greenewald, Kristjan, Melnyk, Igor, Nguyen, Nam et al. 2019. "Estimating Information Flow in Deep Neural Networks." 36th International Conference on Machine Learning, ICML 2019, 2019-June.
dc.contributor.department: MIT-IBM Watson AI Lab [en_US]
dc.relation.journal: 36th International Conference on Machine Learning, ICML 2019 [en_US]
dc.eprint.version: Final published version [en_US]
dc.type.uri: http://purl.org/eprint/type/ConferencePaper [en_US]
eprint.status: http://purl.org/eprint/status/NonPeerReviewed [en_US]
dc.date.updated: 2021-04-12T17:35:09Z
dspace.orderedauthors: Goldfeld, Z; Van Den Berg, E; Greenewald, K; Melnyk, I; Nguyen, N; Kingsbury, B; Polyanskiy, Y [en_US]
dspace.date.submission: 2021-04-12T17:35:11Z
mit.journal.volume: 2019-June [en_US]
mit.license: PUBLISHER_POLICY
mit.metadata.status: Authority Work and Publication Information Needed [en_US]
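The abstract's key observation about the binning-based approximation of I(X; Tℓ) can be sketched in a few lines. The code below is a hypothetical illustration, not the authors' implementation (function and variable names are assumptions): for a deterministic network with X taken uniform over the dataset, T = f(X) is a deterministic function of X, so the binned estimate I(X; Bin(Tℓ)) reduces to the entropy H(Bin(Tℓ)) of the discretized representations, which is why it behaves as a measure of geometric clustering rather than of true mutual information.

```python
import numpy as np

def binned_mi(activations, n_bins=30):
    """Binning-based estimate of I(X; T) for a deterministic network.

    With X uniform over the dataset and T a deterministic function of X,
    I(X; Bin(T)) = H(Bin(T)): the estimate is just the entropy of the
    discretized hidden representations.
    """
    # Discretize each activation into equal-width bins over its range.
    lo, hi = activations.min(), activations.max()
    edges = np.linspace(lo, hi, n_bins + 1)
    binned = np.digitize(activations, edges[1:-1])  # shape (n, d)

    # Count how many samples fall into each distinct binned pattern.
    _, counts = np.unique(binned, axis=0, return_counts=True)
    p = counts / counts.sum()
    return -np.sum(p * np.log2(p))  # entropy in bits

# Tightly clustered representations give a low binned "MI";
# spread-out representations give a high one.
rng = np.random.default_rng(0)
spread = rng.uniform(-1, 1, size=(1000, 5))           # 1000 distinct points
clustered = np.repeat(rng.uniform(-1, 1, size=(4, 5)), 250, axis=0)  # 4 clusters
print(binned_mi(spread) > binned_mi(clustered))  # True
```

Four equally sized clusters yield an estimate of at most log2(4) = 2 bits regardless of bin count, while 1000 spread-out points approach log2(1000) ≈ 10 bits, matching the paper's point that the binned quantity tracks clustering of same-class representations rather than information flow.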

