Model-based Planning for Efficient Task Execution

Ding, Wenqi

dc.contributor.advisor	Balakrishnan, Hamsa
dc.contributor.author	Ding, Wenqi
dc.date.accessioned	2025-09-18T14:28:30Z
dc.date.available	2025-09-18T14:28:30Z
dc.date.issued	2025-05
dc.date.submitted	2025-06-23T14:01:44.498Z
dc.identifier.uri	https://hdl.handle.net/1721.1/162710
dc.description.abstract	Robotic agents navigating 3D environments must continuously decide their next moves by reasoning about both visual observations and high-level language instructions. However, they plan in a high-dimensional latent space, opaque to human collaborators. Hence, it is difficult for humans to understand the agent’s decision-making process. This lack of interpretability hinders effective collaboration between humans and robots. The key question we are trying to answer in this thesis is: Can we build a unified planning framework that fuses visual and language into a single, interpretable representation, so that humans can interpret robots’ decisions? We propose a model-based planning framework built around pretrained vision-language models (VLMs). We show that VLMs can be used to plan in a unified embedding space, where visual and language representations can be decoded back to human-interpretable forms. Empirical evaluation on vision-language navigation benchmarks demonstrates both improved sample efficiency and transparent decision making, enabling human-in-the-loop planning and more effective human-robot collaboration.
dc.publisher	Massachusetts Institute of Technology
dc.rights	In Copyright - Educational Use Permitted
dc.rights	Copyright retained by author(s)
dc.rights.uri	https://rightsstatements.org/page/InC-EDU/1.0/
dc.title	Model-based Planning for Efficient Task Execution
dc.type	Thesis
dc.description.degree	M.Eng.
dc.contributor.department	Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science
mit.thesis.degree	Master
thesis.degree.name	Master of Engineering in Electrical Engineering and Computer Science

Files in this item

Name:: ding-wenqi22-meng-eecs-2025-th ...
Size:: 4.830Mb
Format:: PDF
Description:: Thesis PDF

View/Open

This item appears in the following Collection(s)

Graduate Theses

Show simple item record