Show simple item record

dc.contributor.author: Pirsiavash, Hamed
dc.contributor.author: Vondrick, Carl Martin
dc.contributor.author: Oktay, Deniz
dc.contributor.author: Torralba, Antonio
dc.date.accessioned: 2018-01-12T18:47:34Z
dc.date.available: 2018-01-12T18:47:34Z
dc.date.issued: 2016-12
dc.date.submitted: 2016-06
dc.identifier.isbn: 978-1-4673-8851-1
dc.identifier.uri: http://hdl.handle.net/1721.1/113091
dc.description.abstract: Understanding human actions is a key problem in computer vision. However, recognizing actions is only the first step of understanding what a person is doing. In this paper, we introduce the problem of predicting why a person has performed an action in images. This problem has many applications in human activity understanding, such as anticipating or explaining an action. To study this problem, we introduce a new dataset of people performing actions annotated with likely motivations. However, the information in an image alone may not be sufficient to automatically solve this task. Since humans can rely on their lifetime of experiences to infer motivation, we propose to give computer vision systems access to some of these experiences by using recently developed natural language models to mine knowledge stored in massive amounts of text. While we are still far away from fully understanding motivation, our results suggest that transferring knowledge from language into vision can help machines understand why people in images might be performing an action. [en_US]
dc.description.sponsorship: National Science Foundation (U.S.) (Grant IIS-1524817) [en_US]
dc.description.sponsorship: Google (Firm) (Faculty Research Award) [en_US]
dc.language.iso: en_US
dc.publisher: Institute of Electrical and Electronics Engineers (IEEE) [en_US]
dc.relation.isversionof: http://dx.doi.org/10.1109/CVPR.2016.327 [en_US]
dc.rights: Creative Commons Attribution-Noncommercial-Share Alike [en_US]
dc.rights.uri: http://creativecommons.org/licenses/by-nc-sa/4.0/ [en_US]
dc.source: Other repository [en_US]
dc.title: Predicting Motivations of Actions by Leveraging Text [en_US]
dc.type: Article [en_US]
dc.identifier.citation: Vondrick, Carl, Deniz Oktay, Hamed Pirsiavash, and Antonio Torralba. “Predicting Motivations of Actions by Leveraging Text.” 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 27-30 June 2016, Las Vegas, Nevada, IEEE, 2016. [en_US]
dc.contributor.department: Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science [en_US]
dc.contributor.mitauthor: Vondrick, Carl Martin
dc.contributor.mitauthor: Oktay, Deniz
dc.contributor.mitauthor: Torralba, Antonio
dc.relation.journal: 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) [en_US]
dc.eprint.version: Author's final manuscript [en_US]
dc.type.uri: http://purl.org/eprint/type/ConferencePaper [en_US]
eprint.status: http://purl.org/eprint/status/NonPeerReviewed [en_US]
dspace.orderedauthors: Vondrick, Carl; Oktay, Deniz; Pirsiavash, Hamed; Torralba, Antonio [en_US]
dspace.embargo.terms: N [en_US]
dc.identifier.orcid: https://orcid.org/0000-0001-5676-2387
dc.identifier.orcid: https://orcid.org/0000-0003-4915-0256
mit.license: OPEN_ACCESS_POLICY [en_US]


