Show simple item record

dc.contributor.advisor: Rus, Daniela
dc.contributor.author: Quach, Alex H.
dc.date.accessioned: 2024-09-03T21:08:18Z
dc.date.available: 2024-09-03T21:08:18Z
dc.date.issued: 2024-05
dc.date.submitted: 2024-07-11T14:36:25.550Z
dc.identifier.uri: https://hdl.handle.net/1721.1/156571
dc.description.abstract: Achieving generalization for autonomous robotic systems operating in real-world environments remains a significant challenge. Training robots solely in simulation can be limiting due to the "sim-to-real gap": discrepancies between simulated and real-world conditions. We present two novel approaches to enhance the generalization capabilities of autonomous quadrotor navigation systems when transferring from simulation to the real world. Our first approach integrates a 3D Gaussian Splatting radiance field with a quadrotor flight dynamics engine to generate high-quality, photorealistic training data. We design imitation learning schemes to train liquid time-constant neural networks on this data. Through rigorous evaluations, we demonstrate successful zero-shot transfer of the learned navigation policies from simulation to real-world flight, exhibiting generalization to complex, multi-step tasks in novel indoor and outdoor environments. Notably, we showcase autonomous quadrotor policies trained entirely in simulation that can be directly deployed in the real world without fine-tuning. Our method leverages the complementary strengths of photorealistic rendering and irregularly time-sampled data augmentation for enhancing generalization with liquid neural networks. Additionally, we compose off-the-shelf vision-and-language models with neural policies, enabling real-world generalization to complex objects and instructions unseen during training. To the best of our knowledge, this is the first report of zero-shot sim-to-real transfer and semantic generalization for autonomous quadrotor navigation using imitation learning. Our key contributions include: (1) a dynamics-augmented Gaussian splatting simulator, (2) implicit closed-loop augmentation via expert trajectory design, (3) robustifying liquid neural networks through irregularly sampled data, (4) extensive simulation and real-world validation, (5) demonstrating zero-shot real-world transfer capabilities, and (6) enabling zero-shot instruction generalization to novel objects using multimodal representations.
dc.publisher: Massachusetts Institute of Technology
dc.rights: Attribution-NonCommercial-NoDerivatives 4.0 International (CC BY-NC-ND 4.0)
dc.rights: Copyright retained by author(s)
dc.rights.uri: https://creativecommons.org/licenses/by-nc-nd/4.0/
dc.title: Robust Scene and Object Generalization of Neural Policies Trained in Synthetic Environments
dc.type: Thesis
dc.description.degree: M.Eng.
dc.contributor.department: Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science
mit.thesis.degree: Master
thesis.degree.name: Master of Engineering in Electrical Engineering and Computer Science
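
The abstract above mentions irregularly time-sampled data augmentation as one ingredient for robustifying liquid time-constant policies. The following is a minimal illustrative sketch, not the thesis implementation: it only shows the general idea of subsampling a regularly logged expert trajectory at irregular intervals so a time-aware policy is trained on varying time gaps. All names, shapes, and parameters here are assumptions made for the example.

    # Minimal sketch (assumed names and shapes, not the thesis code):
    # randomly subsample a regularly logged expert trajectory so the
    # time gaps between consecutive training samples become irregular.
    import numpy as np

    def irregular_resample(observations, actions, timestamps, keep_fraction=0.7, rng=None):
        """Keep a random subset of steps and return the resulting
        irregular time gaps alongside the kept observations/actions."""
        rng = np.random.default_rng() if rng is None else rng
        n = len(timestamps)
        # Always keep the first step; randomly keep a fraction of the rest.
        kept_rest = 1 + rng.choice(n - 1, size=int(keep_fraction * (n - 1)), replace=False)
        keep = np.sort(np.concatenate(([0], kept_rest)))
        # Elapsed time between kept steps (first gap is zero by construction).
        dt = np.diff(timestamps[keep], prepend=timestamps[keep][0])
        return observations[keep], actions[keep], dt

    # Usage: obs_aug, act_aug, dt = irregular_resample(obs, act, t)
    # The (observation, dt) pairs become inputs and the actions become
    # imitation targets for a time-continuous policy network.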

