MIT Libraries logoDSpace@MIT

MIT
View Item 
  • DSpace@MIT Home
  • MIT Open Access Articles
  • MIT Open Access Articles
  • View Item
  • DSpace@MIT Home
  • MIT Open Access Articles
  • MIT Open Access Articles
  • View Item
JavaScript is disabled for your browser. Some features of this site may not work without it.

Predicting Motivations of Actions by Leveraging Text

Author(s)
Pirsiavash, Hamed; Vondrick, Carl Martin; Oktay, Deniz; Torralba, Antonio
Thumbnail
DownloadTorralba_Predicting motivations.pdf (11.89Mb)
OPEN_ACCESS_POLICY

Open Access Policy

Creative Commons Attribution-Noncommercial-Share Alike

Terms of use
Creative Commons Attribution-Noncommercial-Share Alike http://creativecommons.org/licenses/by-nc-sa/4.0/
Metadata
Show full item record
Abstract
Understanding human actions is a key problem in computer vision. However, recognizing actions is only the first step of understanding what a person is doing. In this paper, we introduce the problem of predicting why a person has performed an action in images. This problem has many applications in human activity understanding, such as anticipating or explaining an action. To study this problem, we introduce a new dataset of people performing actions annotated with likely motivations. However, the information in an image alone may not be sufficient to automatically solve this task. Since humans can rely on their lifetime of experiences to infer motivation, we propose to give computer vision systems access to some of these experiences by using recently developed natural language models to mine knowledge stored in massive amounts of text. While we are still far away from fully understanding motivation, our results suggest that transferring knowledge from language into vision can help machines understand why people in images might be performing an action.
Date issued
2016-12
URI
http://hdl.handle.net/1721.1/113091
Department
Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science
Journal
2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR)
Publisher
Institute of Electrical and Electronics Engineers (IEEE)
Citation
Vondrick, Carl, Deniz Oktay, Hamed Pirsiavash, and Antonio Torralba. “Predicting Motivations of Actions by Leveraging Text.” 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 27-30 June 2016, Las Vegas, Nevada, IEEE, 2016.
Version: Author's final manuscript
ISBN
978-1-4673-8851-1

Collections
  • MIT Open Access Articles

Browse

All of DSpaceCommunities & CollectionsBy Issue DateAuthorsTitlesSubjectsThis CollectionBy Issue DateAuthorsTitlesSubjects

My Account

Login

Statistics

OA StatisticsStatistics by CountryStatistics by Department
MIT Libraries
PrivacyPermissionsAccessibilityContact us
MIT
Content created by the MIT Libraries, CC BY-NC unless otherwise noted. Notify us about copyright concerns.