| dc.contributor.advisor | Rus, Daniela | |
| dc.contributor.author | Quach, Alex H. | |
| dc.date.accessioned | 2024-09-03T21:08:18Z | |
| dc.date.available | 2024-09-03T21:08:18Z | |
| dc.date.issued | 2024-05 | |
| dc.date.submitted | 2024-07-11T14:36:25.550Z | |
| dc.identifier.uri | https://hdl.handle.net/1721.1/156571 | |
| dc.description.abstract | Achieving generalization for autonomous robotic systems operating in real-world environments remains a significant challenge. Training robots solely in simulations can be limiting due to the "sim-to-real gap": discrepancies between simulated and real-world conditions. We present two novel approaches to enhance the generalization capabilities of autonomous quadrotor navigation systems when transferring from simulation to the real world. Our first approach integrates a 3D Gaussian Splatting radiance field with a quadrotor flight dynamics engine to generate high-quality, photorealistic training data. We design imitation learning schemes to train liquid time-constant neural networks on this data. Through rigorous evaluations, we demonstrate successful zero-shot transfer of the learned navigation policies from simulation to real-world flight, exhibiting generalization to complex, multi-step tasks in novel indoor and outdoor environments. Notably, we showcase autonomous quadrotor policies trained entirely in simulation that can be directly deployed in the real world without fine-tuning. Our method leverages the complementary strengths of photorealistic rendering and irregularly time-sampled data augmentation for enhancing generalization with liquid neural networks. Additionally, we compose off-the-shelf vision-and-language models with neural policies, enabling real-world generalization to complex objects and instructions unseen during training. To the best of our knowledge, this is the first report of zero-shot sim-to-real transfer and semantic generalization for autonomous quadrotor navigation using imitation learning. Our key contributions include: (1) a dynamics-augmented Gaussian splatting simulator, (2) implicit closed-loop augmentation via expert trajectory design, (3) robustifying liquid neural networks through irregularly sampled data, (4) extensive simulation and real-world validation, (5) demonstrating zero-shot real-world transfer capabilities, and (6) enabling zero-shot instruction generalization to novel objects using multimodal representations. | |
| dc.publisher | Massachusetts Institute of Technology | |
| dc.rights | Attribution-NonCommercial-NoDerivatives 4.0 International (CC BY-NC-ND 4.0) | |
| dc.rights | Copyright retained by author(s) | |
| dc.rights.uri | https://creativecommons.org/licenses/by-nc-nd/4.0/ | |
| dc.title | Robust Scene and Object Generalization of Neural Policies Trained in Synthetic Environments | |
| dc.type | Thesis | |
| dc.description.degree | M.Eng. | |
| dc.contributor.department | Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science | |
| mit.thesis.degree | Master | |
| thesis.degree.name | Master of Engineering in Electrical Engineering and Computer Science | |