
Field | Value | Language
dc.contributor.advisor | Nicholas Roy. | en_US
dc.contributor.author | Greene, W. Nicholas (William Nicholas) | en_US
dc.contributor.other | Massachusetts Institute of Technology. Department of Aeronautics and Astronautics. | en_US
dc.date.accessioned | 2021-05-24T20:22:42Z |
dc.date.available | 2021-05-24T20:22:42Z |
dc.date.copyright | 2021 | en_US
dc.date.issued | 2021 | en_US
dc.identifier.uri | https://hdl.handle.net/1721.1/130747 |
dc.description | Thesis: Ph. D., Massachusetts Institute of Technology, Department of Aeronautics and Astronautics, February, 2021 | en_US
dc.description | Cataloged from the official PDF of thesis. | en_US
dc.description | Includes bibliographical references (pages 135-151). | en_US
dc.description.abstract | Monocular cameras are powerful sensors for a variety of computer vision tasks since they are small, inexpensive, and provide dense perceptual information about the surrounding environment. Efficiently estimating the pose of a moving monocular camera and the 3D structure of the observed scene from the images alone is a fundamental problem in computer vision commonly referred to as monocular simultaneous localization and mapping (SLAM). Given the importance of egomotion estimation and environmental mapping to many applications in robotics and augmented reality, the last twenty years have seen dramatic advances in the state of the art in monocular SLAM. Despite the rapid progress, however, several limitations remain that prevent monocular SLAM systems from transitioning out of the research laboratory and into large, uncontrolled environments on small, resource-constrained computing platforms. This thesis presents research that attempts to address existing problems in monocular SLAM by leveraging different sources of prior information along with targeted applications of machine learning. First, we exploit the piecewise planar structure common in many environments in order to represent the scene using compact triangular meshes that will allow for faster reconstruction and regularization. Second, we leverage the semantic information encoded in large datasets of images to constrain the unobservable scale of motion of the monocular solution to the true, metric scale without additional sensors. Lastly, we compensate for known viewpoint changes when associating pixels between images in order to allow for robust, learning-based depth estimation across disparate views. | en_US
dc.description.statementofresponsibility | by W. Nicholas Greene. | en_US
dc.format.extent | 151 pages | en_US
dc.language.iso | eng | en_US
dc.publisher | Massachusetts Institute of Technology | en_US
dc.rights | MIT theses may be protected by copyright. Please reuse MIT thesis content according to the MIT Libraries Permissions Policy, which is available through the URL provided. | en_US
dc.rights.uri | http://dspace.mit.edu/handle/1721.1/7582 | en_US
dc.subject | Aeronautics and Astronautics. | en_US
dc.title | Leveraging prior information for real-time monocular simultaneous localization and mapping | en_US
dc.type | Thesis | en_US
dc.description.degree | Ph. D. | en_US
dc.contributor.department | Massachusetts Institute of Technology. Department of Aeronautics and Astronautics | en_US
dc.identifier.oclc | 1251896691 | en_US
dc.description.collection | Ph.D. Massachusetts Institute of Technology, Department of Aeronautics and Astronautics | en_US
dspace.imported | 2021-05-24T20:22:42Z | en_US
mit.thesis.degree | Doctoral | en_US
mit.thesis.department | Aero | en_US

