In-Home Daily-Life Captioning Using Radio Signals
Author(s)
Fan, Lijie; Li, Tianhong; Yuan, Yuan; Katabi, Dina
Open Access Policy
Creative Commons Attribution-Noncommercial-Share Alike
Terms of use
Abstract
This paper aims to caption daily life, i.e., to create a textual description of people’s activities and interactions with objects in their homes. Addressing this problem requires novel methods beyond traditional video captioning, as most people would have privacy concerns about deploying cameras throughout their homes. We introduce RF-Diary, a new model for captioning daily life by analyzing the privacy-preserving radio signal in the home together with the home’s floormap. RF-Diary can further observe and caption people’s lives through walls and occlusions and in dark settings. In designing RF-Diary, we exploit the ability of radio signals to capture people’s 3D dynamics, and use the floormap to help the model learn people’s interactions with objects. We also use a multi-modal feature alignment training scheme that leverages existing video-based captioning datasets to improve the performance of our radio-based captioning model. Extensive experimental results demonstrate that RF-Diary generates accurate captions under visible conditions. It also sustains its good performance in dark or occluded settings, where video-based captioning approaches fail to generate meaningful captions. (For more information, please visit our project webpage: http://rf-diary.csail.mit.edu.)
Description
Part of the Lecture Notes in Computer Science book series (LNCS, volume 12347)
Date issued
2020-11
Department
Massachusetts Institute of Technology. Computer Science and Artificial Intelligence Laboratory; Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science
Journal
Lecture Notes in Computer Science
Publisher
Springer International Publishing
Citation
Fan, Lijie et al. "In-Home Daily-Life Captioning Using Radio Signals." ECCV 2020: European Conference on Computer Vision, Lecture Notes in Computer Science, 12347, 105-123. © 2020 Springer Nature Switzerland
Version: Author's final manuscript
ISBN
9783030585358
9783030585365
ISSN
0302-9743
1611-3349