Notice
This is not the latest version of this item. The latest version can be found at: https://dspace.mit.edu/handle/1721.1/137109.2
Using a Neural Network Codec Approximation Loss to Improve Source Separation Performance in Limited Capacity Networks
| DC Field | Value | Language |
| --- | --- | --- |
| dc.contributor.author | Ananthabhotla, I | |
| dc.contributor.author | Ewert, S | |
| dc.contributor.author | Paradiso, JA | |
| dc.date.accessioned | 2021-11-02T16:57:11Z | |
| dc.date.available | 2021-11-02T16:57:11Z | |
| dc.date.issued | 2020 | |
| dc.identifier.uri | https://hdl.handle.net/1721.1/137109 | |
| dc.description.abstract | © 2020 IEEE. A growing need for on-device machine learning has led to an increased interest in light-weight neural networks that lower model complexity while retaining performance. While a variety of general-purpose techniques exist in this context, very few approaches exploit domain-specific properties to further improve upon the capacity-performance trade-off. In this paper, extending our prior work [1], we train a network to emulate the behaviour of an audio codec and use this network to construct a loss. By approximating the psychoacoustic model underlying the codec, our approach enables light-weight neural networks to focus on perceptually relevant properties without wasting their limited capacity on imperceptible signal components. We adapt our method to two audio source separation tasks, demonstrate an improvement in performance for small-scale networks via listening tests, characterize the behaviour of the loss network in detail, and quantify the relationship between performance gain and model capacity. Our work illustrates the potential for incorporating perceptual principles into objective functions for neural networks. | en_US |
| dc.language.iso | en | |
| dc.publisher | IEEE | en_US |
| dc.relation.isversionof | 10.1109/IJCNN48605.2020.9207053 | en_US |
| dc.rights | Creative Commons Attribution-Noncommercial-Share Alike | en_US |
| dc.rights.uri | http://creativecommons.org/licenses/by-nc-sa/4.0/ | en_US |
| dc.source | MIT web domain | en_US |
| dc.title | Using a Neural Network Codec Approximation Loss to Improve Source Separation Performance in Limited Capacity Networks | en_US |
| dc.type | Article | en_US |
| dc.identifier.citation | Ananthabhotla, I, Ewert, S and Paradiso, JA. 2020. "Using a Neural Network Codec Approximation Loss to Improve Source Separation Performance in Limited Capacity Networks." Proceedings of the International Joint Conference on Neural Networks. | |
| dc.relation.journal | Proceedings of the International Joint Conference on Neural Networks | en_US |
| dc.eprint.version | Author's final manuscript | en_US |
| dc.type.uri | http://purl.org/eprint/type/ConferencePaper | en_US |
| eprint.status | http://purl.org/eprint/status/NonPeerReviewed | en_US |
| dc.date.updated | 2021-06-25T17:55:19Z | |
| dspace.orderedauthors | Ananthabhotla, I; Ewert, S; Paradiso, JA | en_US |
| dspace.date.submission | 2021-06-25T17:55:21Z | |
| mit.license | OPEN_ACCESS_POLICY | |
| mit.metadata.status | Authority Work and Publication Information Needed | en_US |
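As a rough illustration of the approach summarized in the abstract above, the sketch below shows one way a pre-trained codec-emulator network could serve as a loss for a small source separation model. This is an assumed PyTorch formulation, not the authors' implementation: the emulator architecture, the L1 comparison in its output space, and the names `CodecApproximationLoss` and `codec_net` are placeholders.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class CodecApproximationLoss(nn.Module):
    """Loss computed in the output space of a frozen codec-emulator network
    (placeholder architecture), so signal differences the codec would discard
    as imperceptible contribute less to the training signal."""
    def __init__(self, codec_net: nn.Module):
        super().__init__()
        self.codec_net = codec_net
        for p in self.codec_net.parameters():
            p.requires_grad = False  # emulator is pre-trained and kept fixed

    def forward(self, estimate: torch.Tensor, target: torch.Tensor) -> torch.Tensor:
        # Compare separator output and reference after passing both through the emulator.
        return F.l1_loss(self.codec_net(estimate), self.codec_net(target))

# Stand-in emulator; in the paper's setting such a network would be trained
# beforehand to approximate the codec's encode/decode behaviour.
codec_net = nn.Sequential(
    nn.Conv1d(1, 8, kernel_size=9, padding=4),
    nn.ReLU(),
    nn.Conv1d(8, 1, kernel_size=9, padding=4),
)
loss_fn = CodecApproximationLoss(codec_net)

estimate = torch.randn(2, 1, 16000, requires_grad=True)  # separator output (batch, channel, samples)
target = torch.randn(2, 1, 16000)                         # ground-truth source
loss = loss_fn(estimate, target)
loss.backward()  # gradients reach the separator through the frozen emulator
```

In this sketch the emulator simply replaces the waveform-domain loss; in a training loop the separation network's output would be passed in as `estimate` so that the limited-capacity model spends its capacity on perceptually relevant components.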
