Gaze Prediction in First-Person View Videos
Author(s)
Zhou, Diane Yue.
Other Contributors
Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science.
Advisor
Aude Oliva and Mathew Monfort.
Abstract
Gaze is an important topic in computer vision because it reveals the points of interest that tend to capture a subject's attention in a scene, as well as the subject's potential intentions. Gaze data is becoming more readily obtainable with technological advances in wearable cameras, opening the way to more accurate first-person view gaze prediction models and to richer analyses of gaze. In this research, we use gaze data collected from Pupil Labs glasses to build and compare several gaze prediction models. Our models predict the location of gaze in each frame of a first-person view video by leveraging convolutional neural networks based solely on visual saliency maps. We believe that future work incorporating more context information about the camera wearer's behavior and the scenes in the videos would further improve the models' performance.
Description
Thesis: M. Eng., Massachusetts Institute of Technology, Department of Electrical Engineering and Computer Science, September, 2020. Cataloged from student-submitted PDF of thesis. Includes bibliographical references (pages ).
Date issued
2020
Department
Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science
Publisher
Massachusetts Institute of Technology
Keywords
Electrical Engineering and Computer Science.