dc.contributor.author: Jeon, Se Hwan
dc.contributor.author: Heim, Steve
dc.contributor.author: Khazoom, Charles
dc.contributor.author: Kim, Sangbae
dc.date.accessioned: 2024-02-28T18:54:47Z
dc.date.available: 2024-02-28T18:54:47Z
dc.date.issued: 2023-05-29
dc.identifier.uri: https://hdl.handle.net/1721.1/153602
dc.description: 2023 IEEE International Conference on Robotics and Automation (ICRA 2023), May 29 - June 2, 2023, London, UK [en_US]
dc.description.abstract: The main challenge in developing effective reinforcement learning (RL) pipelines is often the design and tuning of the reward functions. Well-designed shaping rewards can lead to significantly faster learning. Naively formulated rewards, however, can conflict with the desired behavior and result in overfitting or even erratic performance if not properly tuned. In theory, the broad class of potential-based reward shaping (PBRS) can help guide the learning process without affecting the optimal policy. Although several studies have explored the use of PBRS to accelerate learning convergence, most have been limited to grid-worlds and low-dimensional systems, and RL in robotics has predominantly relied on standard forms of reward shaping. In this paper, we benchmark standard forms of shaping against PBRS for a humanoid robot. We find that in this high-dimensional system, PBRS has only marginal benefits in convergence speed. However, the PBRS reward terms are significantly more robust to scaling than typical reward shaping approaches, and thus easier to tune. [en_US]
dc.language.iso: en
dc.publisher: IEEE [en_US]
dc.relation.isversionof: 10.1109/icra48891.2023.10160885 [en_US]
dc.rights: Creative Commons Attribution-Noncommercial-Share Alike [en_US]
dc.rights.uri: http://creativecommons.org/licenses/by-nc-sa/4.0/ [en_US]
dc.source: arxiv [en_US]
dc.title: Benchmarking Potential Based Rewards for Learning Humanoid Locomotion [en_US]
dc.type: Article [en_US]
dc.identifier.citation: Jeon, Se Hwan, Heim, Steve, Khazoom, Charles and Kim, Sangbae. 2023. "Benchmarking Potential Based Rewards for Learning Humanoid Locomotion." 2023 IEEE International Conference on Robotics and Automation (ICRA).
dc.contributor.department: Massachusetts Institute of Technology. Department of Mechanical Engineering
dc.relation.journal: 2023 IEEE International Conference on Robotics and Automation (ICRA) [en_US]
dc.eprint.version: Author's final manuscript [en_US]
dc.type.uri: http://purl.org/eprint/type/ConferencePaper [en_US]
eprint.status: http://purl.org/eprint/status/NonPeerReviewed [en_US]
dc.date.updated: 2024-02-28T17:09:50Z
dspace.orderedauthors: Jeon, SH; Heim, S; Khazoom, C; Kim, S [en_US]
dspace.date.submission: 2024-02-28T17:09:52Z
mit.license: OPEN_ACCESS_POLICY
mit.metadata.status: Authority Work and Publication Information Needed [en_US]

