dc.contributor.advisor: Uhler, Caroline
dc.contributor.author: Welch, Ryan
dc.date.accessioned: 2025-10-06T17:36:59Z
dc.date.available: 2025-10-06T17:36:59Z
dc.date.issued: 2025-05
dc.date.submitted: 2025-06-23T14:04:11.515Z
dc.identifier.uri: https://hdl.handle.net/1721.1/162961
dc.description.abstract: The problem of pure exploration in sequential decision-making is to identify strategies for efficiently gathering information to uncover hidden properties of an environment. This challenge arises in many practical domains, including clinical diagnostics, recommender systems, and educational testing, where data collection is costly and the effectiveness of exploration is critical. Efficient exploration in these contexts strongly depends on exploiting underlying structural relationships within the environment. For instance, recognizing that multiple medical tests may provide overlapping information can reduce the number of tests required to make a diagnosis. Existing exploration approaches drawn from reinforcement learning and active hypothesis testing typically rely on heuristic strategies that require explicit prior assumptions about such structural information. However, when this information is unknown, heuristic methods often lead to redundant exploration, significantly limiting their practical utility in high-stakes domains. Furthermore, these existing approaches do not leverage past experience to improve their exploration efficiency over time. To overcome these limitations, we introduce In-Context Pure Exploration (ICPE), a novel meta-learning framework capable of autonomously discovering and exploiting latent environmental structures across related tasks to guide efficient exploration. ICPE leverages the in-context learning and sequence-modeling capabilities of transformers, combined with supervised learning and deep reinforcement learning techniques to learn exploration strategies directly from experience. Through extensive experiments on synthetic and semi-synthetic exploration tasks, we demonstrate that ICPE is able to efficiently explore in deterministic, stochastic and highly structured environments without relying on any explicit inductive biases. Our results highlight the potential of ICPE to enable more practical exploration strategies suitable for real-world decision-making contexts.
dc.publisher: Massachusetts Institute of Technology
dc.rights: In Copyright - Educational Use Permitted
dc.rights: Copyright retained by author(s)
dc.rights.uri: https://rightsstatements.org/page/InC-EDU/1.0/
dc.title: Meta-Learning Exploration Strategies with Decision Transformers
dc.type: Thesis
dc.description.degree: M.Eng.
dc.contributor.department: Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science
mit.thesis.degree: Master
thesis.degree.name: Master of Engineering in Electrical Engineering and Computer Science

