Irreversible Actions in Assistance Games with a Dynamic Goal

Author(s)
Mayer, Hendrik T.
Download: Thesis PDF (652.3 KB)
Advisor
Hadfield-Menell, Dylan
Terms of use
Attribution-NonCommercial-NoDerivatives 4.0 International (CC BY-NC-ND 4.0). Copyright retained by author(s). https://creativecommons.org/licenses/by-nc-nd/4.0/
Abstract
Reinforcement Learning (RL) agents optimize reward functions to learn desirable policies in a variety of important real-world applications, such as self-driving cars and recommender systems. In practice, however, it can be very difficult to specify the correct reward function for a complex problem, a challenge known as reward misspecification. Impact measures provide metrics for determining how robust a particular agent's behavior is to reward misspecification. This thesis analyzes one particular impact measure: the frequency of irreversible actions that an agent takes. We study this impact measure using a time-varying model of the principal's preferences, a choice motivated by two primary considerations. First, many real-world scenarios involve a principal with time-varying preferences. Second, an agent that assumes time-varying preferences may be more averse to performing irreversible actions. We examine principal-agent (human-robot) assistance games in toy grid environments inspired by cooperative inverse reinforcement learning [1], where irreversible actions correspond to removing transitions from a POMDP. In these games, we focus on how the frequency of changes in the principal's preferences and the optimality of the principal influence the agent's willingness to take irreversible actions. In 2-node and 4-node assistance games, we find two main results. First, in the presence of a random or approximately optimal human, the robot performs more irreversible actions as the goal state changes position more often. Second, in the presence of an optimal human, the robot rarely performs irreversible actions.
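
To make the setup concrete, here is a minimal, hypothetical Python sketch of a 2-node assistance game with a time-varying goal, in the spirit of the environments described in the abstract. It is not the thesis's code: the myopic "commit" rule, the p_switch parameter, and all names are assumptions introduced for illustration. An irreversible action here deletes the transition from node 1 back to node 0, matching the abstract's notion of irreversibility as transition removal.

import random

# Hypothetical sketch (not the thesis's actual implementation): a 2-node
# environment where the robot can irreversibly delete the transition from
# node 1 back to node 0. The principal's goal node flips with probability
# p_switch per step (the time-varying goal). We estimate, over many
# episodes, how often a robot that assumes a static goal ends up taking
# the irreversible "lock-in" action.

def run_episode(p_switch, horizon=50, rng=None):
    rng = rng or random.Random()
    pos, goal = 0, 0
    edge_back = True          # transition 1 -> 0 still exists
    took_irreversible = False

    for _ in range(horizon):
        if rng.random() < p_switch:
            goal = 1 - goal   # the principal's preference changes

        if goal == 1:
            if pos == 0:
                pos = 1       # move toward the current goal
            elif edge_back:
                # A robot that assumes the goal is fixed "commits" to
                # node 1 by deleting the return edge: an irreversible act.
                edge_back = False
                took_irreversible = True
        elif pos == 1 and edge_back:
            pos = 0           # still reversible, so return to the goal

    return took_irreversible

def irreversible_rate(p_switch, episodes=2000, seed=0):
    rng = random.Random(seed)
    return sum(run_episode(p_switch, rng=rng) for _ in range(episodes)) / episodes

for p in (0.0, 0.05, 0.2, 0.5):
    print(f"p_switch={p:0.2f}: irreversible-action rate = {irreversible_rate(p):.3f}")

With p_switch = 0 the goal never moves and the robot never commits; as p_switch grows, the goal reaches node 1 more often and the lock-in rate rises, qualitatively matching the abstract's first result for non-optimal principals.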
Date issued
2024-05
URI
https://hdl.handle.net/1721.1/156753
Department
Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science
Publisher
Massachusetts Institute of Technology

Collections
  • Graduate Theses
