dc.contributor.author: Glass, James R.
dc.contributor.author: Saenko, Ekaterina
dc.contributor.author: Livescu, Karen
dc.contributor.author: Darrell, Trevor J.
dc.date.accessioned: 2010-12-14T20:14:09Z
dc.date.available: 2010-12-14T20:14:09Z
dc.date.issued: 2009-09
dc.date.submitted: 2009-01
dc.identifier.issn: 0162-8828
dc.identifier.other: INSPEC Accession Number: 10773214
dc.identifier.uri: http://hdl.handle.net/1721.1/60293
dc.description.abstract: We study the problem of automatic visual speech recognition (VSR) using dynamic Bayesian network (DBN)-based models consisting of multiple sequences of hidden states, each corresponding to an articulatory feature (AF) such as lip opening (LO) or lip rounding (LR). A bank of discriminative articulatory feature classifiers provides input to the DBN, in the form of either virtual evidence (VE) (scaled likelihoods) or raw classifier margin outputs. We present experiments on two tasks, a medium-vocabulary word-ranking task and a small-vocabulary phrase recognition task. We show that articulatory feature-based models outperform baseline models, and we study several aspects of the models, such as the effects of allowing articulatory asynchrony, of using dictionary-based versus whole-word models, and of incorporating classifier outputs via virtual evidence versus alternative observation models. [en_US]
dc.description.sponsorship: United States. Defense Advanced Research Projects Agency [en_US]
dc.description.sponsorship: Industrial Technology Research Institute [en_US]
dc.language.iso: en_US
dc.publisher: Institute of Electrical and Electronics Engineers [en_US]
dc.relation.isversionof: http://dx.doi.org/10.1109/tpami.2008.303 [en_US]
dc.rights: Article is made available in accordance with the publisher's policy and may be subject to US copyright law. Please refer to the publisher's site for terms of use. [en_US]
dc.source: IEEE [en_US]
dc.title: Multistream Articulatory Feature-Based Models for Visual Speech Recognition [en_US]
dc.type: Article [en_US]
dc.identifier.citation: Saenko, K. et al. “Multistream Articulatory Feature-Based Models for Visual Speech Recognition.” Pattern Analysis and Machine Intelligence, IEEE Transactions on 31.9 (2009): 1700-1707. ©2009 IEEE. [en_US]
dc.contributor.department: Massachusetts Institute of Technology. Computer Science and Artificial Intelligence Laboratory [en_US]
dc.contributor.approver: Glass, James R.
dc.contributor.mitauthor: Glass, James R.
dc.contributor.mitauthor: Saenko, Ekaterina
dc.relation.journal: IEEE Transactions on Pattern Analysis and Machine Intelligence [en_US]
dc.eprint.version: Final published version [en_US]
dc.type.uri: http://purl.org/eprint/type/JournalArticle [en_US]
eprint.status: http://purl.org/eprint/status/PeerReviewed [en_US]
dspace.orderedauthors: Saenko, K.; Livescu, K.; Glass, J.; Darrell, T. [en]
dc.identifier.orcid: https://orcid.org/0000-0002-3097-360X
mit.license: PUBLISHER_POLICY [en_US]
mit.metadata.status: Complete

