Show simple item record

dc.contributor.advisor: Rus, Daniela
dc.contributor.author: Seyde, Tim N.
dc.date.accessioned: 2024-09-03T21:11:31Z
dc.date.available: 2024-09-03T21:11:31Z
dc.date.issued: 2024-05
dc.date.submitted: 2024-07-10T13:02:05.050Z
dc.identifier.uri: https://hdl.handle.net/1721.1/156611
dc.description.abstract: Learning control will enable the deployment of autonomous robots in unstructured real-world settings. Solving the associated complex decision processes under real-time constraints requires intuition: guiding current actions by prior experience to anticipate long-horizon environment interactions, while integrating with optimal control to ground action selection in short-horizon system constraints. Keeping the underlying learning process tractable depends on maximizing the task-aligned information extracted from environment interactions while minimizing the guidance required through human intervention. In this thesis, we develop novel learning control algorithms that enable efficient acquisition of complex behaviors while limiting prior knowledge, direct human supervision, and computational requirements. Our study focuses on learning from interaction through reinforcement learning, combining insights from model-free, model-based, and hierarchical techniques. We design decoupled discrete policy structures to yield memory-efficient agent representations. Our study demonstrates the competitive performance of critic-only agents on continuous control tasks, highlighting accelerated information propagation and exploration benefits. We further leverage hierarchical abstraction over diverse behavior components to enable time-efficient optimization. Our methods jointly learn heterogeneous low-level controller parameterizations via mixture policies for single-agent control, while decoupling multi-timescale strategic from reactive reasoning in the context of multi-agent team coordination. Lastly, we build latent world models for multi-step reasoning and sample-efficient interaction selection. Our work employs uncertainty over expected long-term returns for targeted deep exploration and constructs multi-agent interaction models to accelerate competitive behavior learning via self-play in imagination.
In sum, this thesis develops scalable and efficient robot learning algorithms by addressing representational challenges across layers of abstraction, providing agents with an intrinsic ability to set implicit exploration goals under high-level guidance, and facilitating information propagation in limited-data regimes.
dc.publisher: Massachusetts Institute of Technology
dc.rights: Attribution-NonCommercial-NoDerivatives 4.0 International (CC BY-NC-ND 4.0)
dc.rights: Copyright retained by author(s)
dc.rights.uri: https://creativecommons.org/licenses/by-nc-nd/4.0/
dc.title: Efficient Learning Control via Structural Policy Priors, Latent World Models and Hierarchical Abstraction
dc.type: Thesis
dc.description.degree: Ph.D.
dc.contributor.department: Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science
mit.thesis.degree: Doctoral
thesis.degree.name: Doctor of Philosophy


