Incorporating pitch features for tone modeling in automatic recognition of Mandarin Chinese

Chu, Karen Lingyun

dc.contributor.advisor	Wade Shen and Robert C. Berwick.	en_US
dc.contributor.author	Chu, Karen Lingyun	en_US
dc.contributor.other	Massachusetts Institute of Technology. Dept. of Electrical Engineering and Computer Science.	en_US
dc.date.accessioned	2011-02-23T14:41:11Z
dc.date.available	2011-02-23T14:41:11Z
dc.date.copyright	2009	en_US
dc.date.issued	2009	en_US
dc.identifier.uri	http://hdl.handle.net/1721.1/61281
dc.description	Thesis (M. Eng.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 2009.	en_US
dc.description	Cataloged from PDF version of thesis.	en_US
dc.description	Includes bibliographical references (p. 53-56).	en_US
dc.description.abstract	Tone is fundamental to Mandarin Chinese because it is lexical: it determines the meanings of words in spoken Mandarin. For example, the two sentences ... (I like horses) and ... (I like to scold) differ only in the tone carried by the last syllable. Thus, including tone-related information through analysis of pitch data should improve the performance of automatic speech recognition (ASR) systems on Mandarin Chinese. The focus of this thesis is to improve the performance of a non-tonal ASR system on a Mandarin Chinese corpus by modifying the system code to incorporate pitch features. We compile and format a Mandarin Chinese broadcast news corpus for use with the ASR system, and implement a pitch feature extraction algorithm. Additionally, we investigate two algorithms for incorporating pitch features in Mandarin Chinese speech recognition. First, we build and test a baseline tonal ASR system with embedded tone modeling by concatenating the cepstral and pitch feature vectors for use as the input to our phonetic model (a Hidden Markov Model, or HMM). We find that our embedded tone modeling algorithm does improve performance on Mandarin Chinese, showing that tonal information does contribute to Mandarin Chinese speech recognition. Second, we implement and test the effectiveness of HMM-based multistream models.	en_US
dc.description.statementofresponsibility	by Karen Lingyun Chu.	en_US
dc.format.extent	56 p.	en_US
dc.language.iso	eng	en_US
dc.publisher	Massachusetts Institute of Technology	en_US
dc.rights	M.I.T. theses are protected by copyright. They may be viewed from this source for any purpose, but reproduction or distribution in any format is prohibited without written permission. See provided URL for inquiries about permission.	en_US
dc.rights.uri	http://dspace.mit.edu/handle/1721.1/7582	en_US
dc.subject	Electrical Engineering and Computer Science.	en_US
dc.title	Incorporating pitch features for tone modeling in automatic recognition of Mandarin Chinese	en_US
dc.type	Thesis	en_US
dc.description.degree	M.Eng.	en_US
dc.contributor.department	Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science
dc.identifier.oclc	702369091	en_US

Files in this item

Name: 702369091-MIT.pdf
Size: 3.136Mb
Format: PDF
Description: Full printable version

This item appears in the following Collection(s)

Graduate Theses