
dc.contributor.author: Zhang, Chiyuan
dc.contributor.author: Evangelopoulos, Georgios
dc.contributor.author: Voinea, Stephen Constantin
dc.contributor.author: Rosasco, Lorenzo Andrea
dc.contributor.author: Poggio, Tomaso A.
dc.date.accessioned: 2016-05-13T18:51:27Z
dc.date.available: 2016-05-13T18:51:27Z
dc.date.issued: 2014-05
dc.identifier.isbn: 978-1-4799-2893-4
dc.identifier.issn: 1520-6149
dc.identifier.uri: http://hdl.handle.net/1721.1/102485
dc.description.abstract: Representations in the auditory cortex might be based on mechanisms similar to the visual ventral stream: modules for building invariance to transformations, and multiple layers for compositionality and selectivity. In this paper we propose the use of such computational modules for extracting invariant and discriminative audio representations. Building on a theory of invariance in hierarchical architectures, we propose a novel mid-level representation for acoustical signals, using the empirical distributions of projections on a set of templates and their transformations. Under the assumption that, by construction, this dictionary of templates is composed of similar classes and samples the orbit of variance-inducing signal transformations (such as shift and scale), the resulting signature is theoretically guaranteed to be unique, invariant to transformations, and stable to deformations. Modules of projection and pooling can then constitute layers of deep networks for learning composite representations. We present the main theoretical and computational aspects of a framework for unsupervised learning of invariant audio representations, empirically evaluated on music genre classification.
dc.description.sponsorship: National Science Foundation (U.S.) (STC Center for Brains, Minds and Machines Award CCF-1231216)
dc.description.sponsorship: Italian Ministry of Education, University and Research (FIRB Project RBFR12M3AC)
dc.language.iso: en_US
dc.publisher: Institute of Electrical and Electronics Engineers (IEEE)
dc.relation.isversionof: http://dx.doi.org/10.1109/ICASSP.2014.6854954
dc.rights: Creative Commons Attribution-Noncommercial-Share Alike
dc.rights.uri: http://creativecommons.org/licenses/by-nc-sa/4.0/
dc.source: arXiv
dc.title: A deep representation for invariance and music classification
dc.type: Article
dc.identifier.citation: Zhang, Chiyuan, Georgios Evangelopoulos, Stephen Voinea, Lorenzo Rosasco, and Tomaso Poggio. “A Deep Representation for Invariance and Music Classification.” 2014 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) (May 2014).
dc.contributor.department: Center for Brains, Minds, and Machines
dc.contributor.department: Massachusetts Institute of Technology. Department of Brain and Cognitive Sciences
dc.contributor.department: Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science
dc.contributor.department: McGovern Institute for Brain Research at MIT
dc.contributor.mitauthor: Zhang, Chiyuan
dc.contributor.mitauthor: Evangelopoulos, Georgios
dc.contributor.mitauthor: Voinea, Stephen Constantin
dc.contributor.mitauthor: Rosasco, Lorenzo Andrea
dc.contributor.mitauthor: Poggio, Tomaso A.
dc.relation.journal: Proceedings of the 2014 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)
dc.eprint.version: Author's final manuscript
dc.type.uri: http://purl.org/eprint/type/ConferencePaper
eprint.status: http://purl.org/eprint/status/NonPeerReviewed
dspace.orderedauthors: Zhang, Chiyuan; Evangelopoulos, Georgios; Voinea, Stephen; Rosasco, Lorenzo; Poggio, Tomaso
dspace.embargo.terms: N
dc.identifier.orcid: https://orcid.org/0000-0001-8467-1888
dc.identifier.orcid: https://orcid.org/0000-0002-3944-0455
dc.identifier.orcid: https://orcid.org/0000-0001-6376-4786
dc.identifier.orcid: https://orcid.org/0000-0003-2240-1801
dc.identifier.orcid: https://orcid.org/0000-0002-5727-9941
mit.license: OPEN_ACCESS_POLICY
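The abstract above describes a signature built by projecting a signal onto a set of templates and their transformations, then pooling the projections into an empirical distribution that is invariant to the sampled transformations. A minimal sketch of one such projection-and-pooling module, assuming circular shifts as the transformation group and histogram pooling (all names and parameter choices here are illustrative, not the paper's implementation):

```python
import numpy as np

rng = np.random.default_rng(0)

def circular_shifts(template, n_shifts):
    # Sample the shift orbit of a template by circularly shifting it;
    # the shifts cover one full period of the signal.
    step = len(template) // n_shifts
    return np.stack([np.roll(template, s * step) for s in range(n_shifts)])

def invariant_signature(x, templates, n_shifts=16, n_bins=8):
    # For each template: project x onto every transformed template, then
    # pool the projections into a normalized histogram. Concatenating the
    # histograms gives the (shift-invariant) signature of x.
    feats = []
    for t in templates:
        orbit = circular_shifts(t, n_shifts)              # (n_shifts, d)
        proj = orbit @ x                                  # dot products with orbit
        hist, _ = np.histogram(proj, bins=n_bins, range=(-1.0, 1.0))
        feats.append(hist / n_shifts)                     # empirical distribution
    return np.concatenate(feats)

# Toy check: the signature should not change when the input is shifted by
# one of the sampled transformations.
d = 64
x = rng.standard_normal(d)
x /= np.linalg.norm(x)
templates = rng.standard_normal((4, d))
templates /= np.linalg.norm(templates, axis=1, keepdims=True)

s1 = invariant_signature(x, templates)
s2 = invariant_signature(np.roll(x, 4), templates)        # shifted input
```

Shifting the input merely permutes the set of projections over the orbit, so the pooled histogram (and hence the signature) is unchanged; stacking such modules, with one layer's signatures feeding the next, is what the abstract refers to as layers of a deep network.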

