Leveraging prior information for real-time monocular simultaneous localization and mapping

Author(s)
Greene, W. Nicholas (William Nicholas)
Download: 1251896691-MIT.pdf (15.58 MB)
Other Contributors
Massachusetts Institute of Technology. Department of Aeronautics and Astronautics.
Advisor
Nicholas Roy.
Terms of use
MIT theses may be protected by copyright. Please reuse MIT thesis content according to the MIT Libraries Permissions Policy, which is available through the URL provided. http://dspace.mit.edu/handle/1721.1/7582
Abstract
Monocular cameras are powerful sensors for a variety of computer vision tasks since they are small and inexpensive and provide dense perceptual information about the surrounding environment. Efficiently estimating the pose of a moving monocular camera and the 3D structure of the observed scene from the images alone is a fundamental problem in computer vision commonly referred to as monocular simultaneous localization and mapping (SLAM). Given the importance of egomotion estimation and environmental mapping to many applications in robotics and augmented reality, the last twenty years have seen dramatic advances in the state of the art in monocular SLAM. Despite the rapid progress, however, several limitations remain that prevent monocular SLAM systems from transitioning out of the research laboratory and into large, uncontrolled environments on small, resource-constrained computing platforms. This thesis presents research that attempts to address existing problems in monocular SLAM by leveraging different sources of prior information along with targeted applications of machine learning. First, we exploit the piecewise planar structure common in many environments to represent the scene using compact triangular meshes that allow for faster reconstruction and regularization. Second, we leverage the semantic information encoded in large datasets of images to constrain the unobservable scale of motion of the monocular solution to the true, metric scale without additional sensors. Lastly, we compensate for known viewpoint changes when associating pixels between images in order to allow for robust, learning-based depth estimation across disparate views.
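As a rough illustration of the first idea, the sketch below (plain NumPy/SciPy, not code from the thesis) builds a Delaunay triangulation over a handful of pixels with known depth and fills in the remaining pixels by barycentric interpolation of inverse depth. Linearly interpolating inverse depth across an image-space triangle corresponds to a planar surface patch under a pinhole camera model, which is what makes compact triangular meshes a natural fit for piecewise-planar scenes. The function name interpolate_inverse_depth and the inverse-depth parameterization are illustrative assumptions, not details taken from the thesis.

import numpy as np
from scipy.spatial import Delaunay

def interpolate_inverse_depth(pixels, inv_depths, queries):
    """Piecewise-planar (per-triangle) interpolation of sparse inverse depths.

    pixels:     (N, 2) pixel coordinates with known inverse depth
    inv_depths: (N,)   inverse-depth estimates at those pixels
    queries:    (M, 2) pixel coordinates to fill in
    Returns an (M,) array of interpolated inverse depths (NaN outside the mesh).
    """
    pixels = np.asarray(pixels, dtype=float)
    inv_depths = np.asarray(inv_depths, dtype=float)
    queries = np.asarray(queries, dtype=float)

    tri = Delaunay(pixels)                   # triangulate the sparse support pixels
    simplex = tri.find_simplex(queries)      # triangle index per query (-1 if outside mesh)
    valid = simplex >= 0

    out = np.full(len(queries), np.nan)
    if not np.any(valid):
        return out

    # Barycentric coordinates of each valid query inside its containing triangle.
    trans = tri.transform[simplex[valid]]             # (K, 3, 2) affine maps to barycentric space
    delta = queries[valid] - trans[:, 2]               # offset from the reference vertex
    bary2 = np.einsum('kij,kj->ki', trans[:, :2], delta)
    bary = np.c_[bary2, 1.0 - bary2.sum(axis=1)]       # (K, 3) barycentric weights

    # Linear (planar) blend of the vertex inverse depths within each triangle.
    verts = tri.simplices[simplex[valid]]               # (K, 3) vertex indices
    out[valid] = np.einsum('ki,ki->k', bary, inv_depths[verts])
    return out

# Example: densify depth over a 640x480 image from a handful of tracked features.
# feats = np.array([[50, 60], [600, 40], [320, 440], [100, 400]], dtype=float)
# idepth = 1.0 / np.array([2.0, 3.5, 1.8, 2.2])       # metres -> inverse depth
# us, vs = np.meshgrid(np.arange(640), np.arange(480))
# dense = interpolate_inverse_depth(feats, idepth, np.c_[us.ravel(), vs.ravel()])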
Description
Thesis: Ph. D., Massachusetts Institute of Technology, Department of Aeronautics and Astronautics, February 2021
Cataloged from the official PDF of the thesis.
Includes bibliographical references (pages 135-151).
Date issued
2021
URI
https://hdl.handle.net/1721.1/130747
Department
Massachusetts Institute of Technology. Department of Aeronautics and Astronautics
Publisher
Massachusetts Institute of Technology
Keywords
Aeronautics and Astronautics.

Collections
  • Doctoral Theses
