dc.contributor.advisor: Timothy J. Hazen. [en_US]
dc.contributor.author: La, Chia-Hao, 1980- [en_US]
dc.contributor.other: Massachusetts Institute of Technology. Dept. of Electrical Engineering and Computer Science. [en_US]
dc.date.accessioned: 2006-03-24T16:13:35Z
dc.date.available: 2006-03-24T16:13:35Z
dc.date.copyright: 2003 [en_US]
dc.date.issued: 2003 [en_US]
dc.identifier.uri: http://hdl.handle.net/1721.1/29670
dc.description: Thesis (M.Eng.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 2003. [en_US]
dc.description: Includes bibliographical references (p. 51-52). [en_US]
dc.description.abstract: This thesis describes a method for augmenting an audio-only speech recognizer with visual lip-reading information, in order to improve the performance and robustness of the recognizer. The speech recognizer's variable length audio segments are resolved with the fixed length video frames using segment constrained Hidden Markov Modeling. A Viterbi search over the per-segment Hidden Markov Model resolves the variable asynchrony between the audio and video streams. The two streams are combined according to a relative weighting scheme, which is determined by optimizing on a held-out data set. Although a full audio-visual system has not yet been implemented, this thesis describes the infrastructure that has been developed to accommodate integration with a visual lip-reading module that will be completed in the near future. [en_US]
dc.description.statementofresponsibility: by Chia-Hao La. [en_US]
dc.format.extent: 52 p. [en_US]
dc.format.extent: 1862592 bytes
dc.format.extent: 1862400 bytes
dc.format.mimetype: application/pdf
dc.format.mimetype: application/pdf
dc.language.iso: eng [en_US]
dc.publisher: Massachusetts Institute of Technology [en_US]
dc.rights: M.I.T. theses are protected by copyright. They may be viewed from this source for any purpose, but reproduction or distribution in any format is prohibited without written permission. See provided URL for inquiries about permission. [en_US]
dc.rights.uri: http://dspace.mit.edu/handle/1721.1/7582
dc.subject: Electrical Engineering and Computer Science. [en_US]
dc.title: Infrastructure development for integration of lip reading into the SUMMIT Speech Recognizer [en_US]
dc.type: Thesis [en_US]
dc.description.degree: M.Eng. [en_US]
dc.contributor.department: Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science
dc.identifier.oclc: 53833510 [en_US]

