dc.contributor.advisor: Balakrishnan, Hamsa
dc.contributor.author: Ding, Wenqi
dc.date.accessioned: 2025-09-18T14:28:30Z
dc.date.available: 2025-09-18T14:28:30Z
dc.date.issued: 2025-05
dc.date.submitted: 2025-06-23T14:01:44.498Z
dc.identifier.uri: https://hdl.handle.net/1721.1/162710
dc.description.abstract: Robotic agents navigating 3D environments must continuously decide their next moves by reasoning about both visual observations and high-level language instructions. However, learned agents typically plan in a high-dimensional latent space that is opaque to human collaborators, making it difficult for humans to understand the agent's decision-making process. This lack of interpretability hinders effective collaboration between humans and robots. The key question this thesis addresses is: can we build a unified planning framework that fuses visual and linguistic inputs into a single, interpretable representation, so that humans can understand a robot's decisions? We propose a model-based planning framework built around pretrained vision-language models (VLMs). We show that VLMs can be used to plan in a unified embedding space, where visual and language representations can be decoded back to human-interpretable forms (an illustrative sketch of this idea follows the record below). Empirical evaluation on vision-language navigation benchmarks demonstrates both improved sample efficiency and transparent decision making, enabling human-in-the-loop planning and more effective human-robot collaboration.
dc.publisher: Massachusetts Institute of Technology
dc.rights: In Copyright - Educational Use Permitted
dc.rights: Copyright retained by author(s)
dc.rights.uri: https://rightsstatements.org/page/InC-EDU/1.0/
dc.title: Model-based Planning for Efficient Task Execution
dc.type: Thesis
dc.description.degree: M.Eng.
dc.contributor.department: Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science
mit.thesis.degree: Master
thesis.degree.name: Master of Engineering in Electrical Engineering and Computer Science
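
To make the abstract's central idea concrete, the following is a minimal sketch of planning in a shared vision-language embedding space. It assumes a CLIP-style VLM from the Hugging Face transformers library and uses hypothetical placeholder candidate views; it is not the planner developed in the thesis, only an illustration of how a language instruction and candidate visual observations can be compared in one embedding space while both remain human-interpretable.

```python
# Illustrative sketch only: scoring candidate moves against a language
# instruction in a shared vision-language embedding space. The model choice,
# candidate generation, and action set are assumptions, not the thesis's method.
import torch
from PIL import Image
from transformers import CLIPModel, CLIPProcessor

# Pretrained CLIP-style VLM used as a stand-in for the framework's VLM backbone.
model = CLIPModel.from_pretrained("openai/clip-vit-base-patch32")
processor = CLIPProcessor.from_pretrained("openai/clip-vit-base-patch32")

instruction = "walk down the hallway and stop at the red door"

# Hypothetical candidate views, one per admissible action. In a real system
# these would be observed or predicted images of where each move leads.
candidate_views = {
    "turn_left": Image.new("RGB", (224, 224), color=(120, 120, 120)),
    "go_forward": Image.new("RGB", (224, 224), color=(180, 60, 60)),
    "turn_right": Image.new("RGB", (224, 224), color=(60, 60, 180)),
}

inputs = processor(
    text=[instruction],
    images=list(candidate_views.values()),
    return_tensors="pt",
    padding=True,
)

with torch.no_grad():
    # Embed the instruction and the candidate views in the same space.
    text_emb = model.get_text_features(
        input_ids=inputs["input_ids"], attention_mask=inputs["attention_mask"]
    )
    image_emb = model.get_image_features(pixel_values=inputs["pixel_values"])
    # Normalize and compute cosine similarity of each candidate to the instruction.
    text_emb = text_emb / text_emb.norm(dim=-1, keepdim=True)
    image_emb = image_emb / image_emb.norm(dim=-1, keepdim=True)
    scores = (image_emb @ text_emb.T).squeeze(-1)

# Every score is tied to a concrete image and a plain-text instruction,
# so a human can inspect exactly what each comparison was based on.
for action, score in zip(candidate_views, scores.tolist()):
    print(f"{action}: {score:.3f}")
best_action = max(zip(candidate_views, scores.tolist()), key=lambda kv: kv[1])[0]
print("selected:", best_action)
```

Because both sides of the comparison stay in human-readable form (an image per candidate action and the original instruction text), this kind of embedding-space scoring illustrates the transparency the abstract describes, even though the actual framework in the thesis is more elaborate.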

