Notice
This is not the latest version of this item. The latest version can be found at: https://dspace.mit.edu/handle/1721.1/137109.2
Using a Neural Network Codec Approximation Loss to Improve Source Separation Performance in Limited Capacity Networks
| dc.contributor.author | Ananthabhotla, I | |
| dc.contributor.author | Ewert, S | |
| dc.contributor.author | Paradiso, JA | |
| dc.date.accessioned | 2021-11-02T16:57:11Z | |
| dc.date.available | 2021-11-02T16:57:11Z | |
| dc.date.issued | 2020 | |
| dc.identifier.uri | https://hdl.handle.net/1721.1/137109 | |
| dc.description.abstract | © 2020 IEEE. A growing need for on-device machine learning has led to an increased interest in light-weight neural networks that lower model complexity while retaining performance. While a variety of general-purpose techniques exist in this context, very few approaches exploit domain-specific properties to further improve upon the capacity-performance trade-off. In this paper, extending our prior work [1], we train a network to emulate the behaviour of an audio codec and use this network to construct a loss. By approximating the psychoacoustic model underlying the codec, our approach enables light-weight neural networks to focus on perceptually relevant properties without wasting their limited capacity on imperceptible signal components. We adapt our method to two audio source separation tasks, demonstrate an improvement in performance for small-scale networks via listening tests, characterize the behaviour of the loss network in detail, and quantify the relationship between performance gain and model capacity. Our work illustrates the potential for incorporating perceptual principles into objective functions for neural networks. | en_US | 
| dc.language.iso | en | |
| dc.publisher | IEEE | en_US | 
| dc.relation.isversionof | 10.1109/IJCNN48605.2020.9207053 | en_US | 
| dc.rights | Creative Commons Attribution-Noncommercial-Share Alike | en_US | 
| dc.rights.uri | http://creativecommons.org/licenses/by-nc-sa/4.0/ | en_US | 
| dc.source | MIT web domain | en_US | 
| dc.title | Using a Neural Network Codec Approximation Loss to Improve Source Separation Performance in Limited Capacity Networks | en_US | 
| dc.type | Article | en_US | 
| dc.identifier.citation | Ananthabhotla, I, Ewert, S and Paradiso, JA. 2020. "Using a Neural Network Codec Approximation Loss to Improve Source Separation Performance in Limited Capacity Networks." Proceedings of the International Joint Conference on Neural Networks. | |
| dc.relation.journal | Proceedings of the International Joint Conference on Neural Networks | en_US | 
| dc.eprint.version | Author's final manuscript | en_US | 
| dc.type.uri | http://purl.org/eprint/type/ConferencePaper | en_US | 
| eprint.status | http://purl.org/eprint/status/NonPeerReviewed | en_US | 
| dc.date.updated | 2021-06-25T17:55:19Z | |
| dspace.orderedauthors | Ananthabhotla, I; Ewert, S; Paradiso, JA | en_US | 
| dspace.date.submission | 2021-06-25T17:55:21Z | |
| mit.license | OPEN_ACCESS_POLICY | |
| mit.metadata.status | Authority Work and Publication Information Needed | en_US | 
