Learning Legged Locomotion by Physics-based Initialization: Motion Imitation from Model-Based Optimal Control
Author(s)
Miller, Adam Joseph
Advisor
Kim, Sangbae
Abstract
The development of legged robots capable of navigating and interacting with the world is advancing quickly as new methods for sensing, decision-making, and control expand the capabilities of state-of-the-art systems. Model-based methods, empowered by greater computing capacity and clever formulations, are imbuing systems with deeper physics-based understanding, while machine learning techniques, enabled by parallelized data generation and more efficient training, are imparting greater robustness to noise and the ability to handle poorly defined world features. Together these tools constitute the two major paradigms of legged robot research, and while both have shortcomings, their limitations are complementary: each can be offset by the other’s strengths.
We propose MIMOC: Motion Imitation from Model-Based Optimal Control. MIMOC is a Reinforcement Learning (RL) locomotion controller that learns agile locomotion by imitating reference trajectories from model-based optimal control. MIMOC mitigates challenges faced by other motion imitation-based RL approaches because the generated reference trajectories are dynamically consistent, require no motion retargeting, and include torque references that are essential for learning dynamic locomotion. As a result, MIMOC does not require any fine-tuning to transfer the policy to real robots. MIMOC also overcomes key issues with model-based optimal controllers: because it is trained with simulated sensor noise and domain randomization, it is less sensitive to modeling and state estimation inaccuracies. We validate MIMOC on the Mini-Cheetah in outdoor environments over a wide variety of challenging terrain and on the MIT Humanoid in simulation. We show that MIMOC can transfer to the real world and to different legged platforms. We also show cases where MIMOC outperforms model-based optimal controllers, and demonstrate the value of imitating torque references.
Date issued
2022-09
Department
Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science
Publisher
Massachusetts Institute of Technology