
dc.contributor.advisor: Daskalakis, Constantinos
dc.contributor.advisor: Rakhlin, Alexander
dc.contributor.author: Chen, Fan
dc.date.accessioned: 2025-03-27T16:58:58Z
dc.date.available: 2025-03-27T16:58:58Z
dc.date.issued: 2025-02
dc.date.submitted: 2025-03-04T17:27:24.828Z
dc.identifier.uri: https://hdl.handle.net/1721.1/158934
dc.description.abstract: We study computational and statistical aspects of learning Latent Markov Decision Processes (LMDPs). In this model, the learner interacts with an MDP drawn at the beginning of each epoch from an unknown mixture of MDPs. To sidestep known impossibility results, we consider several notions of δ-separation of the constituent MDPs. The main thrust of this paper is in establishing a nearly-sharp statistical threshold for the horizon length necessary for efficient learning. On the computational side, we show that under a weaker assumption of separability under the optimal policy, there is a quasi-polynomial algorithm with time complexity scaling in terms of the statistical threshold. We further show a near-matching time complexity lower bound under the exponential time hypothesis.
dc.publisher: Massachusetts Institute of Technology
dc.rights: In Copyright - Educational Use Permitted
dc.rights: Copyright retained by author(s)
dc.rights.uri: https://rightsstatements.org/page/InC-EDU/1.0/
dc.title: Near-Optimal Learning and Planning in Separated Latent MDPs
dc.type: Thesis
dc.description.degree: S.M.
dc.contributor.department: Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science
mit.thesis.degree: Master
thesis.degree.name: Master of Science in Electrical Engineering and Computer Science


