| dc.contributor.advisor | Kaelbling, Leslie Pack | |
| dc.contributor.advisor | Lozano-Pérez, Tomás | |
| dc.contributor.author | Fang, Xiaolin | |
| dc.date.accessioned | 2026-01-20T19:45:45Z | |
| dc.date.available | 2026-01-20T19:45:45Z | |
| dc.date.issued | 2025-09 | |
| dc.date.submitted | 2025-09-15T14:40:25.785Z | |
| dc.identifier.uri | https://hdl.handle.net/1721.1/164567 | |
| dc.description.abstract | Advancing robotic manipulation to achieve generalization across diverse goals, environments, and embodiments is a critical challenge in robotics research. While the availability of data and large-scale training has brought exciting progress in robotic manipulation, current methods often struggle to generalize to unseen, unstructured environments and to solve long-horizon tasks. In this thesis, I present my work in robot learning and planning that enables multi-step manipulation in partially observable environments, towards general-purpose embodied agents. Specifically, I discuss my work in 1) constructing a modular framework that combines affordance estimation from learned perception models with task-and-motion planning (TAMP) for object rearrangement in unstructured scenes, 2) learning generative diffusion models of robot skills, which can be composed to solve unseen combinations of environmental constraints through inference-time optimization, and 3) leveraging large vision-language models (VLMs) to build task-oriented visual abstractions, allowing skills to generalize across different environments with only 5 to 10 demonstrations. Together, these approaches contribute to the generality and scalability of embodied agents towards solving real-world manipulation in unstructured environments. | |
| dc.publisher | Massachusetts Institute of Technology | |
| dc.rights | In Copyright - Educational Use Permitted | |
| dc.rights | Copyright retained by author(s) | |
| dc.rights.uri | https://rightsstatements.org/page/InC-EDU/1.0/ | |
| dc.title | Generalizable Robot Manipulation through Unified Perception, Policy Learning, and Planning | |
| dc.type | Thesis | |
| dc.description.degree | Ph.D. | |
| dc.contributor.department | Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science | |
| mit.thesis.degree | Doctoral | |
| thesis.degree.name | Doctor of Philosophy | |