MIT Libraries homeMIT Libraries logoDSpace@MIT

MIT
View Item 
  • DSpace@MIT Home
  • MIT Open Access Articles
  • MIT Open Access Articles
  • View Item
  • DSpace@MIT Home
  • MIT Open Access Articles
  • MIT Open Access Articles
  • View Item
JavaScript is disabled for your browser. Some features of this site may not work without it.

Scene parsing through ADE20K dataset

Author(s)
Zhou, Bolei; Zhao, Hang; Puig Fernandez, Francesco Xavier; Fidler, Sanja; Barriuso, Adela; Torralba, Antonio; ... Show more Show less
Thumbnail
DownloadAccepted version (4.649Mb)
Terms of use
Creative Commons Attribution-Noncommercial-Share Alike http://creativecommons.org/licenses/by-nc-sa/4.0/
Metadata
Show full item record
Abstract
Scene parsing, or recognizing and segmenting objects and stuff in an image, is one of the key problems in computer vision. Despite the community's efforts in data collection, there are still few image datasets covering a wide range of scenes and object categories with dense and detailed annotations for scene parsing. In this paper, we introduce and analyze the ADE20K dataset, spanning diverse annotations of scenes, objects, parts of objects, and in some cases even parts of parts. A scene parsing benchmark is built upon the ADE20K with 150 object and stuff classes included. Several segmentation baseline models are evaluated on the benchmark. A novel network design called Cascade Segmentation Module is proposed to parse a scene into stuff, objects, and object parts in a cascade and improve over the baselines. We further show that the trained scene parsing networks can lead to applications such as image content removal and scene synthesis. ©2017 Paper presented at the 30th IEEE Conference on Computer Vision and Pattern Recognition (CVPR 2017), July 21-26, 2017, Honolulu, Hawaii.
Date issued
2017
URI
https://hdl.handle.net/1721.1/124982
Department
Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science; Massachusetts Institute of Technology. Computer Science and Artificial Intelligence Laboratory
Journal
Proceedings, 30th IEEE Conference on Computer Vision and Pattern Recognition (CVPR 2017)
Publisher
Institute of Electrical and Electronics Engineers (IEEE)
Citation
Zhou, Bolei, et al., "Scene parsing through ADE20K dataset." Proceedings, 30th IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (Piscataway, N.J.: IEEE, 2017): p. 5122-30 doi 10.1109/CVPR.2017.544 ©2017 Author(s)
Version: Author's final manuscript
ISBN
978-1-5386-0457-1

Collections
  • MIT Open Access Articles

Browse

All of DSpaceCommunities & CollectionsBy Issue DateAuthorsTitlesSubjectsThis CollectionBy Issue DateAuthorsTitlesSubjects

My Account

Login

Statistics

OA StatisticsStatistics by CountryStatistics by Department
MIT Libraries homeMIT Libraries logo

Find us on

Twitter Facebook Instagram YouTube RSS

MIT Libraries navigation

SearchHours & locationsBorrow & requestResearch supportAbout us
PrivacyPermissionsAccessibility
MIT
Massachusetts Institute of Technology
Content created by the MIT Libraries, CC BY-NC unless otherwise noted. Notify us about copyright concerns.