
dc.contributor.advisor: Janet Slifka. (en_US)
dc.contributor.author: Surana, Kushan Krishna (en_US)
dc.contributor.other: Massachusetts Institute of Technology. Dept. of Electrical Engineering and Computer Science. (en_US)
dc.date.accessioned: 2007-04-03T17:11:57Z
dc.date.available: 2007-04-03T17:11:57Z
dc.date.copyright: 2006 (en_US)
dc.date.issued: 2006 (en_US)
dc.identifier.uri: http://hdl.handle.net/1721.1/37104
dc.description: Thesis (M. Eng.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 2006. (en_US)
dc.description: Includes bibliographical references (p. 91-97). (en_US)
dc.description.abstract: Irregular phonation serves an important communicative function in human speech and occurs allophonically in American English. This thesis uses cues from both the temporal and frequency domains, such as fundamental frequency, normalized RMS amplitude, smoothed-energy-difference amplitude (a measure of abruptness in energy variations), and shift-difference amplitude (a measure of periodicity), to classify segments of regular and irregular phonation in normal, continuous speech. Support Vector Machines (SVMs) are used to classify the tokens as examples of either regular or irregular phonation. The tokens are extracted from the TIMIT database and come from 151 different speakers. Both genders are well represented, and the tokens occur in various contexts within the utterance. The training set uses 114 different speakers, while the test set uses another 37 speakers. A total of 292 of 320 irregular tokens (recognition rate of 91.25% with a false alarm rate of 4.98%) and 4105 of 4320 regular tokens (recognition rate of 95.02% with a false alarm rate of 8.75%) are correctly identified. (en_US)
dc.description.abstract: The high recognition rates indicate that the set of acoustic cues is robust in identifying a token as regular or irregular, even in cases where one or two acoustic cues show unexpected values. Also, analysis of irregular tokens in the training set (1331 irregular tokens) shows that 78% occur at word boundaries and 5% occur at syllable boundaries. Of the irregular tokens at syllable boundaries, 72% are either at the junction of a compound word (e.g., "outcast") or at the junction of a base word and a suffix. Of the irregular tokens that do not occur at word or syllable boundaries, 70% occur adjacent to voiceless consonants, mostly in utterance-final location. These observations support irregular phonation as a cue for syntactic boundaries in connected speech and, combined with the robust classification results separating regular from irregular phonation, could be used to improve speech recognition and lexical access models. (en_US)
dc.description.statementofresponsibility: by Kushan Krishna Surana. (en_US)
dc.format.extent: 97 p. (en_US)
dc.language.iso: eng (en_US)
dc.publisher: Massachusetts Institute of Technology (en_US)
dc.rights: M.I.T. theses are protected by copyright. They may be viewed from this source for any purpose, but reproduction or distribution in any format is prohibited without written permission. See provided URL for inquiries about permission. (en_US)
dc.rights.uri: http://dspace.mit.edu/handle/1721.1/7582
dc.subject: Electrical Engineering and Computer Science. (en_US)
dc.title: Classification of vocal fold vibration as regular or irregular in normal, voiced speech (en_US)
dc.type: Thesis (en_US)
dc.description.degree: M.Eng. (en_US)
dc.contributor.department: Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science
dc.identifier.oclc: 84908823 (en_US)
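
The four percentages quoted in the abstract can be reproduced from the token counts it gives. The sketch below is only an arithmetic check, not code from the thesis. It assumes that the false alarm rate reported for one class is the share of the other class's tokens that were mislabeled as it. The variable and function names are illustrative.

```python
# Test-set counts quoted in the abstract.
irregular_total, irregular_correct = 320, 292
regular_total, regular_correct = 4320, 4105


def rate(count, total):
    """Return count/total as a percentage rounded to two decimals."""
    return round(100.0 * count / total, 2)


# Recognition rate: correctly identified tokens of a class over all tokens of that class.
irregular_recognition = rate(irregular_correct, irregular_total)
regular_recognition = rate(regular_correct, regular_total)

# Assumption: the false alarm rate for a class is the fraction of the OTHER
# class's tokens that were wrongly assigned to it.
irregular_false_alarm = rate(regular_total - regular_correct, regular_total)
regular_false_alarm = rate(irregular_total - irregular_correct, irregular_total)

print(irregular_recognition, irregular_false_alarm)
print(regular_recognition, regular_false_alarm)
```

Under that assumption each false alarm rate is the complement of the other class's recognition rate, which matches the figures in the abstract.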

