Rank2Reward: Learning Robot Reward Functions from Passive Video

Author(s)
Yang, Daniel Xin
Download: Thesis PDF (64.30 MB)
Advisor
Agrawal, Pulkit
Terms of use
In Copyright - Educational Use Permitted. Copyright retained by author(s). https://rightsstatements.org/page/InC-EDU/1.0/
Abstract
Teaching robots novel skills from demonstrations via human-in-the-loop data collection techniques such as kinesthetic teaching or teleoperation is a promising approach, but it places a heavy burden on human supervisors, both for collecting data and for instrumenting the system to infer states and actions. In contrast, it is often significantly easier to obtain visual data of tasks being performed. Ideally, this data can guide robot learning for new tasks in novel environments, informing both what to do and how to do it. A powerful way to encode both what to do and how to do it, in the absence of low-level states and actions, is to infer a well-shaped reward function for reinforcement learning. The challenge lies in grounding visual demonstration inputs into such a well-shaped and informative reward function. To this end, we propose Rank2Reward, a technique for learning behaviors from videos of tasks being performed, without access to any low-level states or actions. We leverage the videos to learn a reward function that measures incremental "progress" through a task by learning to rank the video frames of a demonstration in temporal order. By inferring an appropriate ranking, the reward function can quickly indicate when task progress is being made, guiding reinforcement learning to learn the task in new scenarios. We demonstrate the effectiveness of this simple technique at learning behaviors directly from raw video on a number of tasks in simulation as well as on several tasks on a real-world robotic arm.
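The core idea above, learning a reward by ranking demonstration frames in temporal order, can be illustrated with a short sketch. The Python snippet below is an illustrative assumption, not the thesis's actual implementation; `ProgressNet`, `ranking_loss`, and `train` are hypothetical names, and the 64x64 input resolution is an assumed choice. It trains a network to score a later frame above an earlier frame from the same video using a pairwise logistic (Bradley-Terry style) ranking loss, so the learned score can serve as a shaped progress reward for reinforcement learning.

```python
# Minimal sketch (assumed, not the thesis's exact code) of learning a
# progress-ranking reward from raw demonstration video frames.
import torch
import torch.nn as nn

class ProgressNet(nn.Module):
    """Map an RGB frame to a scalar 'progress' score."""
    def __init__(self):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Conv2d(3, 32, kernel_size=8, stride=4), nn.ReLU(),
            nn.Conv2d(32, 64, kernel_size=4, stride=2), nn.ReLU(),
            nn.Flatten(),
            nn.Linear(64 * 6 * 6, 1),  # assumes 64x64 input frames
        )

    def forward(self, frames):  # frames: (B, 3, 64, 64), values in [0, 1]
        return self.encoder(frames).squeeze(-1)

def ranking_loss(net, early, late):
    """Pairwise logistic loss: a frame sampled later in a demonstration
    should receive a higher progress score than an earlier frame from
    the same video."""
    margin = net(late) - net(early)
    return nn.functional.softplus(-margin).mean()  # = -log sigmoid(margin)

def train(net, videos, steps=10_000, batch=64, lr=1e-4):
    """videos: list of (T, 3, 64, 64) tensors, one per demonstration."""
    opt = torch.optim.Adam(net.parameters(), lr=lr)
    for _ in range(steps):
        early, late = [], []
        for _ in range(batch):
            v = videos[torch.randint(len(videos), ()).item()]
            i = torch.randint(len(v) - 1, ()).item()      # earlier frame
            j = torch.randint(i + 1, len(v), ()).item()   # strictly later frame
            early.append(v[i]); late.append(v[j])
        loss = ranking_loss(net, torch.stack(early), torch.stack(late))
        opt.zero_grad(); loss.backward(); opt.step()
    return net  # net(frame) can then serve as a shaped RL reward
```

This sketch covers only the ranking component; it omits the rest of the training pipeline described in the thesis, such as how the learned score is combined with reinforcement learning in new scenarios.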
Date issued
2023-06
URI
https://hdl.handle.net/1721.1/151463
Department
Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science
Publisher
Massachusetts Institute of Technology

Collections
  • Graduate Theses
