MIT Libraries homeMIT Libraries logoDSpace@MIT

MIT
View Item 
  • DSpace@MIT Home
  • MIT Open Access Articles
  • MIT Open Access Articles
  • View Item
  • DSpace@MIT Home
  • MIT Open Access Articles
  • MIT Open Access Articles
  • View Item
JavaScript is disabled for your browser. Some features of this site may not work without it.

Semantic Understanding of Scenes Through the ADE20K Dataset

Author(s)
Zhou, Bolei; Zhao, Hang; Puig Fernandez, Xavier; Xiao, Tete; Fidler, Sanja; Barriuso, Adela; Torralba, Antonio; ... Show more Show less
Thumbnail
DownloadSubmitted version (7.638Mb)
Terms of use
Creative Commons Attribution-Noncommercial-Share Alike http://creativecommons.org/licenses/by-nc-sa/4.0/
Metadata
Show full item record
Abstract
Semantic understanding of visual scenes is one of the holy grails of computer vision. Despite efforts of the community in data collection, there are still few image datasets covering a wide range of scenes and object categories with pixel-wise annotations for scene understanding. In this work, we present a densely annotated dataset ADE20K, which spans diverse annotations of scenes, objects, parts of objects, and in some cases even parts of parts. Totally there are 25k images of the complex everyday scenes containing a variety of objects in their natural spatial context. On average there are 19.5 instances and 10.5 object classes per image. Based on ADE20K, we construct benchmarks for scene parsing and instance segmentation. We provide baseline performances on both of the benchmarks and re-implement state-of-the-art models for open source. We further evaluate the effect of synchronized batch normalization and find that a reasonably large batch size is crucial for the semantic segmentation performance. We show that the networks trained on ADE20K are able to segment a wide variety of scenes and objects.
Date issued
2018-12
URI
https://hdl.handle.net/1721.1/125771
Department
Massachusetts Institute of Technology. Computer Science and Artificial Intelligence Laboratory
Journal
International Journal of Computer Vision
Publisher
Springer Nature
Citation
Zhou, Bolei, et al. "Semantic Understanding of Scenes Through the ADE20K Dataset." International Journal of Computer Vision 127 (2019): 302–321. https://doi.org/10.1007/s11263-018-1140-0 © 2018 Author(s)
Version: Original manuscript
ISSN
1573-1405
0920-5691

Collections
  • MIT Open Access Articles

Browse

All of DSpaceCommunities & CollectionsBy Issue DateAuthorsTitlesSubjectsThis CollectionBy Issue DateAuthorsTitlesSubjects

My Account

Login

Statistics

OA StatisticsStatistics by CountryStatistics by Department
MIT Libraries homeMIT Libraries logo

Find us on

Twitter Facebook Instagram YouTube RSS

MIT Libraries navigation

SearchHours & locationsBorrow & requestResearch supportAbout us
PrivacyPermissionsAccessibility
MIT
Massachusetts Institute of Technology
Content created by the MIT Libraries, CC BY-NC unless otherwise noted. Notify us about copyright concerns.