An FFT-Based Companding Front End for Noise-Robust Automatic Speech Recognition

Raj, Bhiksha; Turicchia, Lorenzo; Schmidt-Nielsen, Bent; Sarpeshkar, Rahul

dc.contributor.author	Raj, Bhiksha
dc.contributor.author	Turicchia, Lorenzo
dc.contributor.author	Schmidt-Nielsen, Bent
dc.contributor.author	Sarpeshkar, Rahul
dc.date.accessioned	2011-11-16T13:45:46Z
dc.date.available	2011-11-16T13:45:46Z
dc.date.issued	2007-06
dc.date.submitted	2006-11
dc.identifier.issn	1687-4714
dc.identifier.issn	1687-4722
dc.identifier.uri	http://hdl.handle.net/1721.1/67033
dc.description.abstract	We describe an FFT-based companding algorithm for preprocessing speech before recognition. The algorithm mimics tone-to-tone suppression and masking in the auditory system to improve automatic speech recognition performance in noise. It is also computationally efficient and suited to digital implementations due to its use of the FFT. In an automotive digits recognition task with the CU-Move database recorded in real environmental noise, the algorithm improves the relative word error rate by 12.5% at -5 dB signal-to-noise ratio (SNR) and by 6.2% across all SNRs (-5 dB SNR to +5 dB SNR). In the Aurora-2 database recorded with artificially added noise in several environments, the algorithm improves the relative word error rate in almost all situations.	en_US
dc.publisher	Hindawi Publishing Corporation	en_US
dc.relation.isversionof	http://dx.doi.org/10.1155/2007/65420	en_US
dc.rights	Creative Commons Attribution	en_US
dc.rights.uri	http://creativecommons.org/licenses/by/2.0	en_US
dc.title	An FFT-Based Companding Front End for Noise-Robust Automatic Speech Recognition	en_US
dc.type	Article	en_US
dc.identifier.citation	EURASIP Journal on Audio, Speech, and Music Processing. 2007 Jun 26;2007(1):065420	en_US
dc.contributor.department	Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science	en_US
dc.contributor.department	Massachusetts Institute of Technology. Research Laboratory of Electronics	en_US
dc.contributor.approver	Turicchia, Lorenzo
dc.contributor.mitauthor	Turicchia, Lorenzo
dc.contributor.mitauthor	Sarpeshkar, Rahul
dc.relation.journal	EURASIP Journal on Audio, Speech, and Music Processing	en_US
dc.eprint.version	Final published version	en_US
dc.type.uri	http://purl.org/eprint/type/JournalArticle	en_US
eprint.status	http://purl.org/eprint/status/PeerReviewed	en_US
dc.date.updated	2011-09-23T17:09:42Z
dc.language.rfc3066	en
dc.rights.holder	et al.; licensee BioMed Central Ltd.
dspace.orderedauthors	Raj, Bhiksha; Turicchia, Lorenzo; Schmidt-Nielsen, Bent; Sarpeshkar, Rahul	en
dc.identifier.orcid	https://orcid.org/0000-0003-0384-3786
mit.license	PUBLISHER_CC	en_US
mit.metadata.status	Complete

Files in this item

Name: 1687-4722-2007-065420.pdf
Size: 1.273Mb
Format: PDF

This item appears in the following Collection(s)

MIT Open Access Articles
