DSpace@MIT

3D Spatial Perception with Real-Time Dense Metric-Semantic SLAM

Author(s)
Rosinol, Antoni
Advisor
Carlone, Luca
Leonard, John J.
Terms of use
In Copyright - Educational Use Permitted. Copyright MIT. http://rightsstatements.org/page/InC-EDU/1.0/
Abstract
3D Spatial Perception is the ability of an agent to perceive and understand the three-dimensional structure of its environment, including its position and orientation within that environment. This ability is essential for autonomous robots to navigate and interact with their surroundings, since it enables robots to perform a wide variety of tasks, such as obstacle avoidance, path planning, and object manipulation. To provide robots with a detailed and accurate representation of the surrounding environment, this thesis first proposes the use of a map representation that is geometrically dense, photometrically accurate, and semantically annotated. We define these maps as metric-semantic maps, and provide algorithms to build such maps in real time. Metric-semantic maps allow both humans and robots to have a shared understanding of the scene, while providing the robot with sufficient information to localize, plan shortest paths, and avoid obstacles along the way. We then present a novel 3D representation that abstracts a dense metric-semantic map into higher-level concepts – such as rooms, corridors, and buildings – and also encodes static objects and dynamic entities. We define such representations as 3D Dynamic Scene Graphs (DSGs), and also provide algorithms to build them. Finally, we show how these approaches can be combined to form a Spatial Perception Engine capable of building both metric-semantic maps and 3D DSGs from visual and inertial data. We also demonstrate the effectiveness of 3D DSGs for fast semantic path-planning queries, which can be used to direct robots using natural language commands. In addition to the algorithms presented in this thesis, we open-source our code and datasets for the research community to use and explore.
We believe that the algorithms and resources provided in this thesis open up exciting new possibilities in the field of 3D spatial perception, and we hope to inspire further research in this area, with the ultimate goal of creating fully autonomous robots that are able to navigate and operate in complex environments.
Date issued
2023-02
URI
https://hdl.handle.net/1721.1/150288
Department
Massachusetts Institute of Technology. Department of Aeronautics and Astronautics
Publisher
Massachusetts Institute of Technology

Collections
  • Doctoral Theses
