Estimating uncertainty models for speech source localization in real-world environments

Wilson, Kevin W. (Kevin William), 1977-

dc.contributor.advisor	Trevor Darrell.	en_US
dc.contributor.author	Wilson, Kevin W. (Kevin William), 1977-	en_US
dc.contributor.other	Massachusetts Institute of Technology. Dept. of Electrical Engineering and Computer Science.	en_US
dc.date.accessioned	2007-08-03T15:41:52Z
dc.date.available	2007-08-03T15:41:52Z
dc.date.copyright	2006	en_US
dc.date.issued	2006	en_US
dc.identifier.uri	http://hdl.handle.net/1721.1/38227
dc.description	Thesis (Ph. D.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 2006.	en_US
dc.description	Includes bibliographical references (p. 131-140).	en_US
dc.description.abstract	This thesis develops improved solutions to the problems of audio source localization and speech source separation in real reverberant environments. For source localization, it develops a new time- and frequency-dependent weighting function for the generalized cross-correlation framework for time delay estimation. This weighting function is derived from the speech spectrogram as the result of a transformation designed to optimally predict localization cue accuracy. By structuring the problem in this way, we take advantage of the nonstationarity of speech in a way that is similar to the psychoacoustics of the precedence effect. For source separation, we use the same weighting function as part of a simple probabilistic generative model of localization cues. We combine this localization cue model with a mixture model of speech log-spectra and use this combined model to do speech source separation. For both source localization and source separation, we show significant performance improvements over existing techniques on both real and simulated data in a range of acoustic environments.	en_US
dc.description.statementofresponsibility	by Kevin William Wilson.	en_US
dc.format.extent	140 p.	en_US
dc.language.iso	eng	en_US
dc.publisher	Massachusetts Institute of Technology	en_US
dc.rights	M.I.T. theses are protected by copyright. They may be viewed from this source for any purpose, but reproduction or distribution in any format is prohibited without written permission. See provided URL for inquiries about permission.	en_US
dc.rights.uri	http://dspace.mit.edu/handle/1721.1/7582
dc.subject	Electrical Engineering and Computer Science.	en_US
dc.title	Estimating uncertainty models for speech source localization in real-world environments	en_US
dc.type	Thesis	en_US
dc.description.degree	Ph.D.	en_US
dc.contributor.department	Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science
dc.identifier.oclc	154236809	en_US

Files in this item

Name: 154236809-MIT.pdf
Size: 1.104Mb
Format: PDF
Description: Full printable version

This item appears in the following Collection(s)

Doctoral Theses
