dc.contributor.advisor | Deb Roy. | en_US |
dc.contributor.author | Yoshida, Norimasa, 1979- | en_US |
dc.contributor.other | Massachusetts Institute of Technology. Dept. of Electrical Engineering and Computer Science. | en_US |
dc.date.accessioned | 2006-03-24T16:19:41Z | |
dc.date.available | 2006-03-24T16:19:41Z | |
dc.date.copyright | 2002 | en_US |
dc.date.issued | 2002 | en_US |
dc.identifier.uri | http://hdl.handle.net/1721.1/29725 | |
dc.description | Thesis (M.Eng.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 2002. | en_US |
dc.description | Includes bibliographical references (p. 79-80). | en_US |
dc.description.abstract | As applications incorporating speech recognition technology become widely used, it is desirable to have such systems interact naturally with their users. For such natural interaction to occur, recognition systems must be able to accurately detect when a speaker has finished speaking. This research presents an analysis combining lower and higher level cues to perform the utterance endpointing task. The analysis involves obtaining the optimal parameters for the signal level utterance segmenter, a component of the speech recognition system in the Cognitive Machines Group, and exploring the incorporation of pause duration and grammar information into the utterance segmentation task. As a result, we obtain an optimal set of parameters for the lower level utterance segmenter, and show that part-of-speech based N-gram language modeling of the spoken words in conjunction with pause duration can provide effective signals for utterance endpointing. | en_US |
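The abstract describes combining pause duration with part-of-speech N-gram evidence to decide when a speaker has finished. The following is a minimal, hypothetical sketch of that idea only; the POS probabilities, thresholds, and combination rule are illustrative assumptions, not the parameters or models used in the thesis.

# Hypothetical sketch: combine a pause-duration cue with a part-of-speech
# N-gram estimate of utterance completeness. All values below are assumptions
# for illustration, not the thesis's actual segmenter parameters.

# Toy probabilities that an utterance ends after a given POS bigram.
POS_END_PROB = {
    ("DET", "NOUN"): 0.7,   # e.g. "... open the door" -- likely complete
    ("VERB", "NOUN"): 0.6,
    ("NOUN", "PREP"): 0.1,  # a trailing preposition usually signals more to come
    ("VERB", "DET"): 0.05,
}

BASE_PAUSE_THRESHOLD = 0.8  # seconds of silence required when grammar is uninformative


def is_endpoint(pos_tags, pause_sec):
    """Return True if the trailing pause plus grammatical context suggest an endpoint.

    pos_tags : list of part-of-speech tags for the words recognized so far
    pause_sec: duration of the trailing silence, in seconds
    """
    if len(pos_tags) < 2:
        return pause_sec > BASE_PAUSE_THRESHOLD

    end_prob = POS_END_PROB.get(tuple(pos_tags[-2:]), 0.3)
    # The more likely the word sequence is grammatically complete,
    # the shorter the pause required before declaring an endpoint.
    required_pause = BASE_PAUSE_THRESHOLD * (1.0 - end_prob) + 0.2
    return pause_sec > required_pause


if __name__ == "__main__":
    print(is_endpoint(["VERB", "DET", "NOUN"], pause_sec=0.5))  # likely complete -> True
    print(is_endpoint(["NOUN", "PREP"], pause_sec=0.5))         # likely incomplete -> False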
dc.description.statementofresponsibility | by Norimasa Yoshida. | en_US |
dc.format.extent | 80 p. | en_US |
dc.format.extent | 2868623 bytes | |
dc.format.extent | 2868427 bytes | |
dc.format.mimetype | application/pdf | |
dc.format.mimetype | application/pdf | |
dc.language.iso | eng | en_US |
dc.publisher | Massachusetts Institute of Technology | en_US |
dc.rights | M.I.T. theses are protected by copyright. They may be viewed from this source for any purpose, but reproduction or distribution in any format is prohibited without written permission. See provided URL for inquiries about permission. | en_US |
dc.rights.uri | http://dspace.mit.edu/handle/1721.1/7582 | |
dc.subject | Electrical Engineering and Computer Science. | en_US |
dc.title | Automatic utterance segmentation in spontaneous speech | en_US |
dc.type | Thesis | en_US |
dc.description.degree | M.Eng. | en_US |
dc.contributor.department | Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science | |
dc.identifier.oclc | 54038711 | en_US |