DSpace@MIT

Large-Scale Multi-Robot Spatial Perception

Author(s)
Chang, Yun
Download
Thesis PDF (55.05 MB)
Advisor
Carlone, Luca
Terms of use
In Copyright - Educational Use Permitted Copyright retained by author(s) https://rightsstatements.org/page/InC-EDU/1.0/
Abstract
This thesis addresses the challenge of scalable and robust multi-robot spatial perception, with the goal of supporting autonomous task execution in large-scale environments. The work focuses on two core issues: scaling to large, complex environments, and incorporating high-level scene understanding to enable autonomy for complex tasks. Current multi-robot systems typically focus on geometric reconstruction for navigation, but often fall short in providing the scene understanding needed for complex decision-making and task execution in real-world environments. Conversely, many recent demonstrations of autonomous task execution are limited to small, controlled environments, with few methods addressing scalability to larger scenes. This thesis bridges this gap by integrating multi-robot simultaneous localization and mapping (SLAM) with spatial perception in order to support downstream autonomy for complex tasks. We begin by introducing methods to enhance the robustness and efficiency of loop closure detection in centralized multi-robot SLAM, focusing on prioritizing loop closures and mitigating the impact of incorrect loop closures in large-scale environments. We then present the first fully distributed metric-semantic SLAM system for multi-robot teams, which supports real-time semantic mapping and enables large-scale deployments with up to 8 robots and 8 kilometers of traversal. To improve reasoning across robot teams, we extend this work to 3D scene graphs, proposing a framework for collaboratively building and maintaining a shared multi-robot scene graph online. Additionally, we introduce algorithms for task-oriented compression of 3D scene graphs to support communication across robots under bandwidth constraints. Finally, we explore open-set scene understanding made possible by advances in visual-language models and highlight the need for task-driven mapping.
Building on this, we propose a novel framework for grounding high-level language commands into scene graphs, enabling robots to decompose high-level tasks into executable subtasks while focusing on task-relevant components of the environment. The contributions of this thesis are validated through experimental evaluations in extreme environments and real-world deployments, where multi-robot teams operate in large-scale settings. These experiments tackle a broad range of tasks, from navigation and object search to executing high-level language commands (e.g., “clean the room”). Our contributions advance multi-robot large-scale spatial perception and have the potential to impact real-world applications such as exploration, service robotics, and search and rescue, where autonomous multi-robot teams are essential for performing complex tasks in large environments.
Date issued
2025-09
URI
https://hdl.handle.net/1721.1/165155
Department
Massachusetts Institute of Technology. Department of Aeronautics and Astronautics
Publisher
Massachusetts Institute of Technology

Collections
  • Doctoral Theses

Content created by the MIT Libraries, CC BY-NC unless otherwise noted. Notify us about copyright concerns.