MIT Libraries logoDSpace@MIT

MIT
View Item 
  • DSpace@MIT Home
  • MIT Libraries
  • MIT Theses
  • Graduate Theses
  • View Item
  • DSpace@MIT Home
  • MIT Libraries
  • MIT Theses
  • Graduate Theses
  • View Item
JavaScript is disabled for your browser. Some features of this site may not work without it.

Meta-Learning Exploration Strategies with Decision Transformers

Author(s)
Welch, Ryan
Thumbnail
DownloadThesis PDF (1.216Mb)
Advisor
Uhler, Caroline
Terms of use
In Copyright - Educational Use Permitted Copyright retained by author(s) https://rightsstatements.org/page/InC-EDU/1.0/
Metadata
Show full item record
Abstract
The problem of pure exploration in sequential decision-making is to identify strategies for efficiently gathering information to uncover hidden properties of an environment. This challenge arises in many practical domains, including clinical diagnostics, recommender systems, and educational testing, where data collection is costly and the effectiveness of exploration is critical. Efficient exploration in these contexts strongly depends on exploiting underlying structural relationships within the environment. For instance, recognizing that multiple medical tests may provide overlapping information can reduce the number of tests required to make a diagnosis. Existing exploration approaches drawn from reinforcement learning and active hypothesis testing typically rely on heuristic strategies that require explicit prior assumptions about such structural information. However, when this information is unknown, heuristic methods often lead to redundant exploration, significantly limiting their practical utility in high-stakes domains. Furthermore, these existing approaches do not leverage past experience to improve their exploration efficiency over time. To overcome these limitations, we introduce In-Context Pure Exploration (ICPE), a novel meta-learning framework capable of autonomously discovering and exploiting latent environmental structures across related tasks to guide efficient exploration. ICPE leverages the in-context learning and sequence-modeling capabilities of transformers, combined with supervised learning and deep reinforcement learning techniques to learn exploration strategies directly from experience. Through extensive experiments on synthetic and semi-synthetic exploration tasks, we demonstrate that ICPE is able to efficiently explore in deterministic, stochastic and highly structured environments without relying on any explicit inductive biases. Our results highlight the potential of ICPE to enable more practical exploration strategies suitable for real-world decision-making contexts.
Date issued
2025-05
URI
https://hdl.handle.net/1721.1/162961
Department
Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science
Publisher
Massachusetts Institute of Technology

Collections
  • Graduate Theses

Browse

All of DSpaceCommunities & CollectionsBy Issue DateAuthorsTitlesSubjectsThis CollectionBy Issue DateAuthorsTitlesSubjects

My Account

Login

Statistics

OA StatisticsStatistics by CountryStatistics by Department
MIT Libraries
PrivacyPermissionsAccessibilityContact us
MIT
Content created by the MIT Libraries, CC BY-NC unless otherwise noted. Notify us about copyright concerns.