Following Gaze in Video

Recasens Continente, Adria; Vondrick, Carl Martin; Khosla, Aditya; Torralba, Antonio

dc.contributor.author	Recasens Continente, Adria
dc.contributor.author	Vondrick, Carl Martin
dc.contributor.author	Khosla, Aditya
dc.contributor.author	Torralba, Antonio
dc.date.accessioned	2019-11-06T20:19:42Z
dc.date.available	2019-11-06T20:19:42Z
dc.date.issued	2017-12
dc.identifier.issn	2380-7504
dc.identifier.uri	https://hdl.handle.net/1721.1/122778
dc.description.abstract	Following the gaze of people inside videos is an important signal for understanding people and their actions. In this paper, we present an approach for following gaze in video by predicting where a person (in the video) is looking even when the object is in a different frame. We collect VideoGaze, a new dataset which we use as a benchmark to both train and evaluate models. Given one frame with a person in it, our model estimates a density for gaze location in every frame and the probability that the person is looking in that particular frame. A key aspect of our approach is an end-to-end model that jointly estimates saliency, gaze pose, and geometric relationships between views while only using gaze as supervision. Visualizations suggest that the model learns to internally solve these intermediate tasks automatically without additional supervision. Experiments show that our approach follows gaze in video better than existing approaches, enabling a richer understanding of human activities in video. Keywords: Motion pictures, Head, Three-dimensional displays, Predictive models, Geometry, Semantics, gaze tracking, learning (artificial intelligence), video signal processing	en_US
dc.language.iso	en
dc.publisher	Institute of Electrical and Electronics Engineers	en_US
dc.relation.isversionof	http://dx.doi.org/10.1109/iccv.2017.160	en_US
dc.rights	Creative Commons Attribution-Noncommercial-Share Alike	en_US
dc.rights.uri	http://creativecommons.org/licenses/by-nc-sa/4.0/	en_US
dc.source	MIT web domain	en_US
dc.title	Following Gaze in Video	en_US
dc.type	Article	en_US
dc.identifier.citation	Recasens Continente, Adria et al. "Following Gaze in Video," 2017 IEEE International Conference on Computer Vision (ICCV), October 2017, Venice, Italy, Institute of Electrical and Electronics Engineers, December 2017 ©IEEE	en_US
dc.contributor.department	Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science	en_US
dc.relation.journal	2017 IEEE International Conference on Computer Vision (ICCV)	en_US
dc.eprint.version	Author's final manuscript	en_US
dc.type.uri	http://purl.org/eprint/type/ConferencePaper	en_US
eprint.status	http://purl.org/eprint/status/NonPeerReviewed	en_US
dc.date.updated	2019-07-11T16:25:39Z
dspace.date.submission	2019-07-11T16:25:40Z

Files in this item

Name: videogazefollow.pdf
Size: 1.822 MB
Format: PDF
Description: Accepted version

This item appears in the following Collection(s)

MIT Open Access Articles