Classification of vocal fold vibration as regular or irregular in normal, voiced speech

Surana, Kushan Krishna

dc.contributor.advisor	Janet Slifka.	en_US
dc.contributor.author	Surana, Kushan Krishna	en_US
dc.contributor.other	Massachusetts Institute of Technology. Dept. of Electrical Engineering and Computer Science.	en_US
dc.date.accessioned	2007-04-03T17:11:57Z
dc.date.available	2007-04-03T17:11:57Z
dc.date.copyright	2006	en_US
dc.date.issued	2006	en_US
dc.identifier.uri	http://hdl.handle.net/1721.1/37104
dc.description	Thesis (M. Eng.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 2006.	en_US
dc.description	Includes bibliographical references (p. 91-97).	en_US
dc.description.abstract	Irregular phonation serves an important communicative function in human speech and occurs allophonically in American English. This thesis uses cues from both the temporal and frequency domains - such as fundamental frequency, normalized RMS amplitude, smoothed-energy-difference amplitude (a measure of abruptness in energy variations) and shift-difference amplitude (a measures of periodicity) -to classify segments of regular and irregular phonation in normal, continuous speech. Support Vector Machines (SVMs) are used to classify the tokens as examples of either regular or irregular phonation. The tokens are extracted from the TIMIT database, and are extracted from 151 different speakers. Both genders are well represented, and the tokens occur in various contexts within the utterance. The train-set uses 114 different speakers, while the test-set uses another 37 speakers. A total of 292 of 320 irregular tokens (recognition rate of 91.25% with a false alarm rate of 4.98%), and 4105 of 4320 regular tokens (recognition rate of 95.02% with a false alarm rate of 8.75%) are correctly identified.	en_US
dc.description.abstract	(cont.) The high recognition rates are an indicator that the set of acoustic cues are robust in accurately identifying a token as regular or irregular, even in cases where one or two acoustic cues show unexpected values. Also, analysis of irregular tokens in the training set (1331 irregular tokens) shows that 78% occur at word boundaries and 5% occur at syllable boundaries. Of the irregular tokens at syllable boundaries, 72% are either at the junction of a compound-word (e.g "outcast;") or at the junction of a base word and a suffix. Of the irregular tokens which do not occur at word or syllable boundaries, 70% occur adjacent to voiceless consonants mostly in utterance-final location. These observations support irregular phonation as a cue for syntactic boundaries in connected speech, and combined with the robust classification results to separate regular phonation from irregular phonation, could be used to improve speech recognition and lexical access models.	en_US
dc.description.statementofresponsibility	by Kushan Krishna Surana.	en_US
dc.format.extent	97 p.	en_US
dc.language.iso	eng	en_US
dc.publisher	Massachusetts Institute of Technology	en_US
dc.rights	M.I.T. theses are protected by copyright. They may be viewed from this source for any purpose, but reproduction or distribution in any format is prohibited without written permission. See provided URL for inquiries about permission.	en_US
dc.rights.uri	http://dspace.mit.edu/handle/1721.1/7582
dc.subject	Electrical Engineering and Computer Science.	en_US
dc.title	Classification of vocal fold vibration as regular or irregular in normal, voiced speech	en_US
dc.type	Thesis	en_US
dc.description.degree	M.Eng.	en_US
dc.contributor.department	Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science
dc.identifier.oclc	84908823	en_US

Files in this item

Name:: 84908823-MIT.pdf
Size:: 5.191Mb
Format:: PDF
Description:: Full printable version

View/Open

This item appears in the following Collection(s)

Graduate Theses

Show simple item record