Basic level scene understanding: from labels to structure and beyond

Xiao, Jianxiong; Russell, Bryan C.; Hays, James; Ehinger, Krista A.; Oliva, Aude; Torralba, Antonio

Terms of use

Creative Commons Attribution-Noncommercial-Share Alike http://creativecommons.org/licenses/by-nc-sa/4.0/

Abstract

An early goal of computer vision was to build a system that could automatically understand a 3D scene just by looking. This requires not only the ability to extract 3D information from image information alone, but also to handle the large variety of different environments that comprise our visual world. This paper summarizes our recent efforts toward these goals. First, we describe the SUN database, which is a collection of annotated images spanning 908 different scene categories. This database allows us to systematically study the space of possible everyday scenes and to establish a benchmark for scene and object recognition. We also explore ways of coping with the variety of viewpoints within these scenes. For this, we have introduced a database of 360° panoramic images for many of the scene categories in the SUN database and have explored viewpoint recognition within the environments. Finally, we describe steps toward a unified 3D parsing of everyday scenes: (i) the ability to localize geometric primitives in images, such as cuboids and cylinders, which often comprise many everyday objects, and (ii) an integrated system to extract the 3D structure of the scene and objects depicted in an image.

Date issued

2012-11

URI

http://hdl.handle.net/1721.1/90941

Department

Massachusetts Institute of Technology. Computer Science and Artificial Intelligence Laboratory; Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science

Journal

SIGGRAPH Asia 2012 Technical Briefs (SA '12)

Publisher

Association for Computing Machinery (ACM)

Citation

Jianxiong Xiao, Bryan C. Russell, James Hays, Krista A. Ehinger, Aude Oliva, and Antonio Torralba. 2012. Basic level scene understanding: from labels to structure and beyond. In SIGGRAPH Asia 2012 Technical Briefs (SA '12). ACM, New York, NY, USA, Article 36, 4 pages.

Version: Author's final manuscript

ISBN

9781450319157

Collections

MIT Open Access Articles
