Learning Legged Locomotion by Physics-based Initialization: Motion Imitation from Model-Based Optimal Control

Author(s)

Miller, Adam Joseph

Advisor

Kim, Sangbae

Terms of use

In Copyright - Educational Use Permitted. Copyright MIT. http://rightsstatements.org/page/InC-EDU/1.0/

Abstract

The development of legged robots capable of navigating in and interacting with the world is quickly advancing as new methods and techniques for sensing, decision-making, and controls expand the capabilities of state-of-the-art systems. Model-based methods, empowered by greater computing capacity and clever formulations, are imbuing systems with further physics-based understanding, while machine learning techniques, enabled by parallelized data generation and more efficient training, are imparting greater robustness to noise and the ability to handle poorly defined world features. Together these tools constitute the two major paradigms of legged robot research. Both have shortcomings, but their limitations are complementary, so each can be offset by the other’s strengths. We propose MIMOC: Motion Imitation from Model-Based Optimal Control. MIMOC is a Reinforcement Learning (RL) locomotion controller that learns agile locomotion by imitating reference trajectories from model-based optimal control. MIMOC mitigates challenges faced by other motion imitation-based RL approaches because the generated reference trajectories are dynamically consistent, require no motion retargeting, and include torque references that are essential to learning dynamic locomotion. As a result, MIMOC does not require any fine-tuning to transfer the policy to real robots. MIMOC also overcomes key issues with model-based optimal controllers. Since it is trained with simulated sensor noise and domain randomization, MIMOC is less sensitive to modeling and state estimation inaccuracies. We validate MIMOC on the Mini-Cheetah in outdoor environments over a wide variety of challenging terrain and on the MIT Humanoid in simulation. We show that MIMOC can transfer to the real world and to different legged platforms. We also show cases where MIMOC outperforms model-based optimal controllers, and demonstrate the value of imitating torque references.

Date issued

2022-09

URI

https://hdl.handle.net/1721.1/147283

Department

Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science

Publisher

Massachusetts Institute of Technology

Collections

Graduate Theses