dc.contributor.author: Xiao, Jianxiong
dc.contributor.author: Russell, Bryan C.
dc.contributor.author: Hays, James
dc.contributor.author: Ehinger, Krista A.
dc.contributor.author: Oliva, Aude
dc.contributor.author: Torralba, Antonio
dc.date.accessioned: 2014-10-15T13:42:04Z
dc.date.available: 2014-10-15T13:42:04Z
dc.date.issued: 2012-11
dc.identifier.isbn: 9781450319157
dc.identifier.uri: http://hdl.handle.net/1721.1/90941
dc.description.abstract: An early goal of computer vision was to build a system that could automatically understand a 3D scene just by looking. This requires not only the ability to extract 3D information from image information alone, but also to handle the large variety of different environments that comprise our visual world. This paper summarizes our recent efforts toward these goals. First, we describe the SUN database, which is a collection of annotated images spanning 908 different scene categories. This database allows us to systematically study the space of possible everyday scenes and to establish a benchmark for scene and object recognition. We also explore ways of coping with the variety of viewpoints within these scenes. For this, we have introduced a database of 360° panoramic images for many of the scene categories in the SUN database and have explored viewpoint recognition within the environments. Finally, we describe steps toward a unified 3D parsing of everyday scenes: (i) the ability to localize geometric primitives in images, such as cuboids and cylinders, which often comprise many everyday objects, and (ii) an integrated system to extract the 3D structure of the scene and objects depicted in an image. [en_US]
dc.description.sponsorship: National Science Foundation (U.S.) (Grant 1016862) [en_US]
dc.description.sponsorship: Google (Firm) (Research Award) [en_US]
dc.description.sponsorship: United States. Office of Naval Research. Multidisciplinary University Research Initiative (N000141010933) [en_US]
dc.description.sponsorship: National Science Foundation (U.S.) (Career Award 0747120) [en_US]
dc.language.iso: en_US
dc.publisher: Association for Computing Machinery (ACM) [en_US]
dc.relation.isversionof: http://dx.doi.org/10.1145/2407746.2407782 [en_US]
dc.rights: Creative Commons Attribution-Noncommercial-Share Alike [en_US]
dc.rights.uri: http://creativecommons.org/licenses/by-nc-sa/4.0/ [en_US]
dc.source: MIT web domain [en_US]
dc.title: Basic level scene understanding: from labels to structure and beyond [en_US]
dc.type: Article [en_US]
dc.identifier.citation: Jianxiong Xiao, Bryan C. Russell, James Hays, Krista A. Ehinger, Aude Oliva, and Antonio Torralba. 2012. Basic level scene understanding: from labels to structure and beyond. In SIGGRAPH Asia 2012 Technical Briefs (SA '12). ACM, New York, NY, USA, Article 36, 4 pages. [en_US]
dc.contributor.department: Massachusetts Institute of Technology. Computer Science and Artificial Intelligence Laboratory [en_US]
dc.contributor.department: Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science [en_US]
dc.contributor.mitauthor: Xiao, Jianxiong [en_US]
dc.contributor.mitauthor: Ehinger, Krista A. [en_US]
dc.contributor.mitauthor: Oliva, Aude [en_US]
dc.contributor.mitauthor: Torralba, Antonio [en_US]
dc.relation.journal: SIGGRAPH Asia 2012 Technical Briefs (SA '12) [en_US]
dc.eprint.version: Author's final manuscript [en_US]
dc.type.uri: http://purl.org/eprint/type/ConferencePaper [en_US]
eprint.status: http://purl.org/eprint/status/NonPeerReviewed [en_US]
dspace.orderedauthors: Xiao, Jianxiong; Russell, Bryan C.; Hays, James; Ehinger, Krista A.; Oliva, Aude; Torralba, Antonio [en_US]
dc.identifier.orcid: https://orcid.org/0000-0003-4915-0256
mit.license: OPEN_ACCESS_POLICY [en_US]
mit.metadata.status: Complete