Applications of missing feature theory to speaker recognition

Padilla, Michael Thomas, 1974-

dc.contributor.advisor	Thomas F. Quatieri.	en_US
dc.contributor.author	Padilla, Michael Thomas, 1974-	en_US
dc.contributor.other	Massachusetts Institute of Technology. Dept. of Electrical Engineering and Computer Science.	en_US
dc.date.accessioned	2011-11-18T20:54:50Z
dc.date.available	2011-11-18T20:54:50Z
dc.date.copyright	2000	en_US
dc.date.issued	2000	en_US
dc.identifier.uri	http://hdl.handle.net/1721.1/67165
dc.description	Thesis (S.M.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 2000.	en_US
dc.description	Includes bibliographical references (p. 100-101).	en_US
dc.description.abstract	An important problem in speaker recognition is the degradation that occurs when speaker models trained with speech from one type of channel are used to score speech from another type of channel, known as channel mismatch. This thesis investigates various channel compensation techniques and approaches from missing feature theory for improving Gaussian mixture model (GMM)-based speaker verification under this mismatch condition. Experiments are performed using a speech corpus consisting of "clean" training speech and "dirty" test speech equal to the clean speech corrupted by additive Gaussian noise. Channel compensation methods studied are cepstral mean subtraction, RASTA, and spectral subtraction. Approaches to missing feature theory include missing feature compensation, which removes corrupted features, and missing feature restoration, which predicts such features from neighboring features in both frequency and time. These methods are investigated both individually and in combination. In particular, missing feature compensation combined with spectral subtraction in the discrete Fourier transform domain significantly improves GMM speaker verification accuracy and outperforms all other methods examined in this thesis, reducing the equal error rate by about 10% more than other methods over an SNR range of 5-25 dB. Moreover, this considerably outperforms a state-of-the-art GMM recognizer for the mismatch application that combines missing feature theory with spectral subtraction developed in a mel-filter energy domain. Finally, the concept of missing feature restoration is explored. A novel linear minimum mean-squared-error missing feature estimator is derived and applied to pure vowels as well as a clean/dirty verification trial. While it does not improve performance in the verification trial, a large SNR improvement for features estimated for the pure vowel case indicates promise in the application of this method.	en_US
dc.description.statementofresponsibility	by Michael Thomas Padilla.	en_US
dc.format.extent	101 p.	en_US
dc.language.iso	eng	en_US
dc.publisher	Massachusetts Institute of Technology	en_US
dc.rights	M.I.T. theses are protected by copyright. They may be viewed from this source for any purpose, but reproduction or distribution in any format is prohibited without written permission. See provided URL for inquiries about permission.	en_US
dc.rights.uri	http://dspace.mit.edu/handle/1721.1/7582	en_US
dc.subject	Electrical Engineering and Computer Science.	en_US
dc.title	Applications of missing feature theory to speaker recognition	en_US
dc.type	Thesis	en_US
dc.description.degree	S.M.	en_US
dc.contributor.department	Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science
dc.identifier.oclc	45170069	en_US

Files in this item

Name: 45170069-MIT.pdf
Size: 6.564Mb
Format: PDF
Description: Full printable version

This item appears in the following Collection(s)

Graduate Theses
