On the Learnability of General Reinforcement-Learning Objectives

Author(s)
Yang, Cambridge
Download
Thesis PDF (7.362 MB)
Advisor
Carbin, Michael
Terms of use
Attribution-NonCommercial-NoDerivatives 4.0 International (CC BY-NC-ND 4.0); copyright retained by author(s). https://creativecommons.org/licenses/by-nc-nd/4.0/
Abstract
Reinforcement learning enables agents to learn decision-making policies in unknown environments to achieve specified objectives. Traditionally, these objectives are expressed through reward functions, enabling well-established guarantees on learning near-optimal policies with high probability, a property known as probably approximately correct (PAC) learnability. However, reward functions often serve as imperfect surrogates for true objectives, leading to reward hacking and undermining these guarantees. This thesis formalizes the specification and learnability of general reinforcement-learning objectives beyond rewards, addressing fundamental questions of expressivity and policy learnability. I examine three increasingly expressive classes of objectives: (1) Linear Temporal Logic (LTL) objectives, which extend conventional scalar rewards to temporal specifications of behavior and have garnered recent attention; (2) computable objectives, encompassing a broad class of structured, algorithmically definable objectives; and (3) non-computable objectives, representing general objectives beyond the computable class. For LTL objectives, I prove that only finitary LTL objectives are PAC-learnable, while infinite-horizon LTL objectives are inherently intractable under the PAC-MDP framework. Extending this result, I establish a general criterion: an objective is PAC-learnable if it is continuous and computable. This criterion makes it possible to establish PAC-learnability for existing classes of objectives whose learnability was previously unknown, and it informs the design of new, learnable objective specifications. Finally, for non-computable objectives, I introduce limit PAC-learnability, a practical relaxation in which a sequence of computable, PAC-learnable objectives approximates a non-computable objective. I formalize a universal representation of non-computable objectives using nested limits of computable functions and provide sufficient conditions under which limit PAC-learnability holds. By establishing a theoretical foundation for general RL objectives, this thesis advances our understanding of which objectives are learnable, how they can be specified, and how agents can effectively learn policies to optimize them. These results contribute to the broader goal of designing intelligent agents that align with expressive, formally defined objectives, moving beyond the limitations of reward-based surrogates.
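As a reading aid for two notions the abstract relies on, the LaTeX sketch below states the standard (epsilon, delta)-style guarantee that "near-optimal with high probability" refers to, and one plausible shape of the nested-limit representation of a non-computable objective. The symbols J (objective value), \hat{\pi} (the learned policy), and f_{k_1,\ldots,k_n} (computable approximants) are illustrative assumptions, not the thesis's own notation.

\[
  \Pr\bigl[\, J(\hat{\pi}) \ \ge\ \sup_{\pi} J(\pi) - \epsilon \,\bigr] \ \ge\ 1 - \delta,
  \qquad \text{after a number of samples polynomial in } 1/\epsilon,\ 1/\delta \text{ (and the problem size)},
\]
\[
  g \;=\; \lim_{k_1 \to \infty}\, \lim_{k_2 \to \infty} \cdots \lim_{k_n \to \infty} f_{k_1, k_2, \ldots, k_n},
  \qquad \text{each } f_{k_1,\ldots,k_n} \text{ computable and PAC-learnable.}
\]

On this reading, limit PAC-learnability relaxes the usual requirement: rather than demanding a PAC guarantee for the non-computable objective g directly, it asks for PAC guarantees on the computable approximants whose nested limits recover g.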
Date issued
2025-05
URI
https://hdl.handle.net/1721.1/164131
Department
Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science
Publisher
Massachusetts Institute of Technology

Collections
  • Doctoral Theses
