Robot manipulation with learned representations

Manuelli, Lucas,Ph. D.Massachusetts Institute of Technology.

Author(s)

Manuelli, Lucas,Ph. D.Massachusetts Institute of Technology.

Download1227703710-MIT.pdf (46.81Mb)

Other Contributors

Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science.

Advisor

Russ Tedrake.

Terms of use

MIT theses may be protected by copyright. Please reuse MIT thesis content according to the MIT Libraries Permissions Policy, which is available through the URL provided. http://dspace.mit.edu/handle/1721.1/7582

Metadata

Show full item record

Abstract

We would like to have robots which can perform useful manipulation tasks in real-world environments. This requires robots that can perceive the world with both precision and semantic understanding, methods for communicating desired tasks to these systems, and closed loop visual feedback controllers for robustly executing manipulation tasks. This is hard to achieve with previous methods: prior work hasn't enabled robots to densely understand the visual world with sufficient precision to perform robotic manipulation or endowed them with the semantic understanding needed to perform tasks with novel objects. This limitation arises partly from the object representations that have been used, the challenge in extracting these representations from the available sensor data in real-world settings, and the manner in which tasks have been specified. This thesis presents a family of approaches that leverage self-supervision, both in the visual domain and for learning physical dynamics, to enable robots to perform manipulation tasks. Specifically we (i) develop a pipeline to efficiently annotate visual data in cluttered and multi-object environments (ii) demonstrate the novel application of dense visual object descriptors to robotic manipulation and provide a fully self-supervised robot system to acquire them (iii) introduce the concept of category-level manipulation tasks and develop a novel object representation based on semantic 3D keypoints along with a task specification that uses these keypoints to define the task for all objects of a category, including novel instances, (iv) utilize our dense visual object descriptors to quickly learn new manipulation skills through imitation and (v) use our visual object representations to learn data-driven models that can be used to perform closed loop feedback control in manipulation tasks.

Description

Thesis: Ph. D., Massachusetts Institute of Technology, Department of Electrical Engineering and Computer Science, September, 2020

Cataloged from student-submitted PDF of thesis.

Includes bibliographical references (pages 177-187).

Date issued

2020

URI

https://hdl.handle.net/1721.1/129293

Department

Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science

Publisher

Massachusetts Institute of Technology

Keywords

Electrical Engineering and Computer Science.

Collections

Doctoral Theses