MIT Libraries logoDSpace@MIT

MIT
View Item 
  • DSpace@MIT Home
  • MIT Libraries
  • MIT Theses
  • Doctoral Theses
  • View Item
  • DSpace@MIT Home
  • MIT Libraries
  • MIT Theses
  • Doctoral Theses
  • View Item
JavaScript is disabled for your browser. Some features of this site may not work without it.

Real-time continuous gesture recognition for natural multimodal interaction

Author(s)
Yin, Ying, Ph. D. Massachusetts Institute of Technology
Thumbnail
DownloadFull printable version (27.40Mb)
Other Contributors
Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science.
Advisor
Randall Davis.
Terms of use
M.I.T. theses are protected by copyright. They may be viewed from this source for any purpose, but reproduction or distribution in any format is prohibited without written permission. See provided URL for inquiries about permission. http://dspace.mit.edu/handle/1721.1/7582
Metadata
Show full item record
Abstract
I have developed a real-time continuous gesture recognition system capable of dealing with two important problems that have previously been neglected: (a) smoothly handling two different kinds of gestures: those characterized by distinct paths and those characterized by distinct hand poses; and (b) determining how and when the system should respond to gestures. The novel approaches in this thesis include: a probabilistic recognition framework based on a flattened hierarchical hidden Markov model (HHMM) that unifies the recognition of path and pose gestures; and a method of using information from the hidden states in the HMM to identify different gesture phases (the pre-stroke, the nucleus and the post-stroke phases), allowing the system to respond appropriately to both gestures that require a discrete response and those needing a continuous response. The system is extensible: new gestures can be added by recording 3-6 repetitions of the gesture; the system will train an HMM model for the gesture and integrate it into the existing HMM, in a process that takes only a few minutes. Our evaluation shows that even using only a small number of training examples (e.g. 6), the system can achieve an average F1 score of 0.805 for two forms of gestures. To evaluate the performance of my system I collected a new dataset (YANG dataset) that includes both path and pose gestures, offering a combination currently lacking in the community and providing the challenge of recognizing different types of gestures mixed together. I also developed a novel hybrid evaluation metric that is more relevant to real- time interaction with different gesture flows.
Description
Thesis: Ph. D., Massachusetts Institute of Technology, Department of Electrical Engineering and Computer Science, 2014.
 
This electronic version was submitted by the student author. The certified thesis is available in the Institute Archives and Special Collections.
 
81
 
Cataloged from student-submitted PDF version of thesis.
 
Includes bibliographical references (pages 147-154).
 
Date issued
2014
URI
http://hdl.handle.net/1721.1/91036
Department
Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science
Publisher
Massachusetts Institute of Technology
Keywords
Electrical Engineering and Computer Science.

Collections
  • Doctoral Theses

Browse

All of DSpaceCommunities & CollectionsBy Issue DateAuthorsTitlesSubjectsThis CollectionBy Issue DateAuthorsTitlesSubjects

My Account

Login

Statistics

OA StatisticsStatistics by CountryStatistics by Department
MIT Libraries
PrivacyPermissionsAccessibilityContact us
MIT
Content created by the MIT Libraries, CC BY-NC unless otherwise noted. Notify us about copyright concerns.