Learning perceptually grounded word meanings from unaligned parallel data
Author(s): Tellex, Stefanie A.; Thaker, Pratiksha R.; Joseph, Joshua Mason; Roy, Nicholas
In order for robots to effectively understand natural language commands, they must be able to acquire meaning representations that can be mapped to perceptual features in the external world. Previous approaches to learning these grounded meaning representations require detailed annotations at training time. In this paper, we present an approach to grounded language acquisition that jointly learns a policy for following natural language commands such as “Pick up the tire pallet,” as well as a mapping between specific phrases in the language and aspects of the external world; for example, the mapping between the words “the tire pallet” and a specific object in the environment. Our approach assumes a parametric form for the policy the robot uses to choose actions in response to a natural language command; the policy factors according to the structure of the language. We use a gradient method to optimize the model parameters. Our evaluation demonstrates the effectiveness of the model on a corpus of commands given to a robotic forklift by untrained users.
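To make the "gradient method" concrete: a minimal sketch of the underlying idea, assuming a generic log-linear policy over candidate groundings (actions and objects) scored by features of the command, trained by gradient ascent on the log-likelihood of observed choices. The feature design, the per-phrase factorization used in the paper, and the function names here (`log_likelihood_grad`, `train`) are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def softmax(scores):
    # Numerically stable softmax over candidate-grounding scores.
    z = scores - scores.max()
    e = np.exp(z)
    return e / e.sum()

def log_likelihood_grad(theta, features, chosen):
    # features: (num_candidates, d) matrix, one feature row per candidate
    # grounding; `chosen` is the index of the observed (correct) candidate.
    probs = softmax(features @ theta)
    # Gradient of log p(chosen) under a log-linear model:
    # observed features minus expected features under the current policy.
    return features[chosen] - probs @ features

def train(data, d, lr=0.1, epochs=100):
    # data: list of (features, chosen) pairs from unaligned parallel
    # command/demonstration examples (a simplifying assumption here).
    theta = np.zeros(d)
    for _ in range(epochs):
        for features, chosen in data:
            theta += lr * log_likelihood_grad(theta, features, chosen)
    return theta
```

In the paper's setting the score would additionally factor across phrases of the parsed command, so each phrase contributes its own feature terms to the shared parameter vector; the sketch above collapses that structure into a single feature matrix per example.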
Department: Massachusetts Institute of Technology. Computer Science and Artificial Intelligence Laboratory; Massachusetts Institute of Technology. Department of Aeronautics and Astronautics; Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science
Citation: Tellex, Stefanie, Pratiksha Thaker, Joshua Joseph, and Nicholas Roy. “Learning perceptually grounded word meanings from unaligned parallel data.” Machine Learning (May 18, 2013).
Author's final manuscript