Show simple item record

dc.contributor.advisor: Patrick H. Winston. [en_US]
dc.contributor.author: Otašević, Nikola (Nikola B.) [en_US]
dc.contributor.other: Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science. [en_US]
dc.date.accessioned: 2014-03-06T15:43:50Z
dc.date.available: 2014-03-06T15:43:50Z
dc.date.copyright: 2013 [en_US]
dc.date.issued: 2013 [en_US]
dc.identifier.uri: http://hdl.handle.net/1721.1/85463
dc.description: Thesis: M. Eng., Massachusetts Institute of Technology, Department of Electrical Engineering and Computer Science, 2013. [en_US]
dc.description: Cataloged from PDF version of thesis. [en_US]
dc.description: Includes bibliographical references (pages 95-97). [en_US]
dc.description.abstract: The human visual system represents a very complex and important part of brain activity, occupying a very significant portion of the cortex resources. It enables us to see colors, detect motion, perceive dimensions and distance. It enables us to solve a very wide range of problems such as image segmentation, object tracking, as well as object and activity recognition. We perform these tasks so easily that we are not even aware of their enormous complexity. How do we do that? This question has motivated decades of research in the field of computer vision. In this thesis, I make a contribution toward solving the particular problem of vision-based human-action recognition by exploiting the compositional nature of simple actions such as running, walking or bending. Noting that simple actions consist of a series of atomic movements and can be represented as a structured sequence of poses, I designed and implemented a system that learns a model of actions based on human-pose classification from a single frame and from a model of transitions between poses through time. The system comprises three parts. The first part is the pose classifier that is capable of inferring a pose from a single frame. Its role is to take as input an image and give its best estimate of the pose in that image. The second part is a hidden Markov model of the transitions between poses. I exploit structural constraints in human motion to build a model that corrects some of the errors made by the independent single-frame pose classifier. Finally, in the third part, the corrected sequence of poses is used to recognize action based on the frequency of pose patterns, the transitions between the poses and hidden Markov models of individual actions.
I demonstrate and test my system on the public KTH dataset, which contains examples of running, walking, jogging, boxing, handclapping, and handwaving, as well as on a new dataset, which contains examples of not only running and walking, but also jumping, crouching, crawling, kicking a ball, passing a basketball, and shooting a basketball. On these datasets, my system exhibits 91% action recognition recall rate. [en_US]
dc.description.statementofresponsibility: by Nikola Otasevic. [en_US]
dc.format.extent: 97 pages [en_US]
dc.language.iso: eng [en_US]
dc.publisher: Massachusetts Institute of Technology [en_US]
dc.rights: M.I.T. theses are protected by copyright. They may be viewed from this source for any purpose, but reproduction or distribution in any format is prohibited without written permission. See provided URL for inquiries about permission. [en_US]
dc.rights.uri: http://dspace.mit.edu/handle/1721.1/7582 [en_US]
dc.subject: Electrical Engineering and Computer Science. [en_US]
dc.title: Recognizing simple human actions by exploiting regularities in pose sequences [en_US]
dc.type: Thesis [en_US]
dc.description.degree: M. Eng. [en_US]
dc.contributor.department: Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science
dc.identifier.oclc: 870969410 [en_US]
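
Illustrative note on the abstract: the second stage it describes, a hidden Markov model that corrects errors made by a per-frame pose classifier, can be pictured with a small Viterbi decoding sketch. This record does not give the thesis's actual states, probabilities, or implementation, so everything below is an assumption made for illustration: the pose labels `stand` and `crouch`, the 0.9/0.1 transition probabilities, the 0.8/0.2 classifier confusion rates, and the function name `viterbi` are all hypothetical.

```python
import math

def viterbi(observations, states, log_start, log_trans, log_emit):
    """Return the most likely hidden pose sequence for the observed labels."""
    # best[t][s] is the log-probability of the best path ending in state s at time t.
    best = [{s: log_start[s] + log_emit[s][observations[0]] for s in states}]
    back = []  # back[t][s] is the predecessor of s on that best path
    for obs in observations[1:]:
        scores, pointers = {}, {}
        for s in states:
            prev = max(states, key=lambda p: best[-1][p] + log_trans[p][s])
            scores[s] = best[-1][prev] + log_trans[prev][s] + log_emit[s][obs]
            pointers[s] = prev
        best.append(scores)
        back.append(pointers)
    # Trace the best path backwards from the most likely final state.
    path = [max(states, key=lambda s: best[-1][s])]
    for pointers in reversed(back):
        path.append(pointers[path[-1]])
    return path[::-1]

# Hypothetical two-pose model (all numbers are illustrative assumptions).
states = ["stand", "crouch"]
log_start = {s: math.log(0.5) for s in states}
# Poses tend to persist from one frame to the next.
log_trans = {a: {b: math.log(0.9 if a == b else 0.1) for b in states} for a in states}
# The single-frame classifier is assumed to be right 80% of the time.
log_emit = {a: {b: math.log(0.8 if a == b else 0.2) for b in states} for a in states}

# Per-frame classifier output with one isolated, implausible "crouch".
noisy = ["stand", "stand", "crouch", "stand", "stand"]
smoothed = viterbi(noisy, states, log_start, log_trans, log_emit)

# A sustained change of pose, which the model should leave untouched.
sustained = ["stand", "stand", "crouch", "crouch", "crouch"]
kept = viterbi(sustained, states, log_start, log_trans, log_emit)
```

Under these assumed numbers the isolated `crouch` frame is treated as a classifier error and relabelled, while the sustained change is kept, which is the kind of temporal regularity the abstract says the second stage exploits.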

