Discovering the Language of Actions

Sharma, Pratyusha

dc.contributor.advisor	Torralba, Antonio
dc.contributor.advisor	Andreas, Jacob
dc.contributor.author	Sharma, Pratyusha
dc.date.accessioned	2022-06-15T13:10:21Z
dc.date.available	2022-06-15T13:10:21Z
dc.date.issued	2022-02
dc.date.submitted	2022-03-04T20:59:48.549Z
dc.identifier.uri	https://hdl.handle.net/1721.1/143293
dc.description.abstract	This thesis takes a look at discovering language-like discrete infinities for actions. How can a stream of continuous data be parsed into skills/concepts, and can we tie the decision of what may be the right set of skills with the problem of generating plans over a continuous action space as in the original stream of data? Can we utilize supervision from aligning parallel language instructions to scaffold the discovery of these named primitives of actions from interactions? Here, we present a framework for learning hierarchical policies from demonstrations, using sparse natural language annotations to guide the discovery of reusable skills for autonomous decision-making. It is formulated as a generative model of action sequences in which goals generate sequences of high-level subtask descriptions, and these descriptions generate sequences of low-level actions. The thesis describes how to train this model using primarily unannotated demonstrations by parsing demonstrations into sequences of named high-level subtasks, using only a small number of seed annotations to ground language in action. In trained models, the space of natural language commands indexes a combinatorial library of skills; agents can use these skills to plan by generating high-level instruction sequences tailored to novel goals. The approach is evaluated in the ALFRED household simulation environment, providing natural language annotations for only 10% of demonstrations. It completes more than twice as many tasks as a standard approach to learning from demonstrations, matching the performance of instruction-following models with access to ground-truth plans during both training and evaluation. Code, data, and additional visualizations are available at https://sites.google.com/view/skill-induction-latent-lang/.
dc.publisher	Massachusetts Institute of Technology
dc.rights	In Copyright - Educational Use Permitted
dc.rights	Copyright MIT
dc.rights.uri	http://rightsstatements.org/page/InC-EDU/1.0/
dc.title	Discovering the Language of Actions
dc.type	Thesis
dc.description.degree	S.M.
dc.contributor.department	Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science
mit.thesis.degree	Master
thesis.degree.name	Master of Science in Electrical Engineering and Computer Science

Files in this item

Name: Sharma_pratyuss_SM_EECS_2022_t ...
Size: 4.511Mb
Format: PDF
Description: Thesis PDF

This item appears in the following Collection(s)

Graduate Theses