Notice
This is not the latest version of this item. The latest version can be found at: https://dspace.mit.edu/handle/1721.1/138366.2
Perspective Plane Program Induction From a Single Image
Author(s)
Li, Yikai; Mao, Jiayuan; Zhang, Xiuming; Freeman, William T; Tenenbaum, Joshua B; Wu, Jiajun
Download: Accepted version (7.572Mb)
Open Access Policy
Creative Commons Attribution-Noncommercial-Share Alike
Terms of use
Abstract
© 2020 IEEE. We study the inverse graphics problem of inferring a holistic representation for natural images. Given an input image, our goal is to induce a neuro-symbolic, program-like representation that jointly models camera poses, object locations, and global scene structures. Such high-level, holistic scene representations further facilitate low-level image manipulation tasks such as inpainting. We formulate this problem as jointly finding the camera pose and scene structure that best describe the input image. The benefits of such joint inference are twofold: scene regularity serves as a new cue for perspective correction, and, in turn, accurate perspective correction leads to a simplified scene structure, similar to how the correct shape leads to the most regular texture in shape from texture. Our proposed framework, Perspective Plane Program Induction (P3I), combines search-based and gradient-based algorithms to solve the problem efficiently. P3I outperforms a set of baselines on a collection of Internet images across tasks including camera pose estimation, global structure inference, and downstream image manipulation.
Date issued
2020
Journal
Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition
Publisher
Institute of Electrical and Electronics Engineers (IEEE)
Citation
Li, Yikai, Mao, Jiayuan, Zhang, Xiuming, Freeman, William T, Tenenbaum, Joshua B et al. 2020. "Perspective Plane Program Induction From a Single Image." Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition.
Version: Author's final manuscript