Irreversible Actions in Assistance Games with a Dynamic Goal

Mayer, Hendrik T.

dc.contributor.advisor	Hadfield-Menell, Dylan
dc.contributor.author	Mayer, Hendrik T.
dc.date.accessioned	2024-09-16T13:47:04Z
dc.date.available	2024-09-16T13:47:04Z
dc.date.issued	2024-05
dc.date.submitted	2024-07-11T14:36:54.185Z
dc.identifier.uri	https://hdl.handle.net/1721.1/156753
dc.description.abstract	Reinforcement Learning (RL) agents optimize reward functions to learn desirable policies in a variety of important real-world applications, such as self-driving cars and recommender systems. In practice, however, it can be very difficult to specify the correct reward function for a complex problem, an issue known as reward misspecification. Impact measures provide metrics to determine how robust a particular agent’s behavior is to reward misspecification. This thesis analyzes one particular impact measure: the frequency of irreversible actions that an agent takes. We study this impact measure using a time-varying model of the principal’s preferences, a choice motivated by two primary considerations. First, many real-world scenarios involve a principal with time-varying preferences. Second, an agent that assumes time-varying preferences may be more averse to performing irreversible actions. In this thesis, we examine principal-agent (human-robot) assistance games in toy grid environments inspired by cooperative inverse reinforcement learning [1], where irreversible actions correspond to removing transitions from a POMDP. In these games, we focus on how the frequency of changes in the principal’s preferences and the optimality of the principal influence the agent’s willingness to take irreversible actions. In 2-node and 4-node assistance games, we find two main results. First, in the presence of a random or approximately optimal human, the robot performs more irreversible actions as the goal state changes position more often. Second, in the presence of an optimal human, the robot rarely performs irreversible actions.
dc.publisher	Massachusetts Institute of Technology
dc.rights	Attribution-NonCommercial-NoDerivatives 4.0 International (CC BY-NC-ND 4.0)
dc.rights	Copyright retained by author(s)
dc.rights.uri	https://creativecommons.org/licenses/by-nc-nd/4.0/
dc.title	Irreversible Actions in Assistance Games with a Dynamic Goal
dc.type	Thesis
dc.description.degree	M.Eng.
dc.contributor.department	Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science
mit.thesis.degree	Master
thesis.degree.name	Master of Engineering in Electrical Engineering and Computer Science

Files in this item

Name: mayer-hmayer-meng-eecs-2024-th ...
Size: 652.3Kb
Format: PDF
Description: Thesis PDF

This item appears in the following Collection(s)

Graduate Theses
