Learning perceptually grounded word meanings from unaligned parallel data
Author(s): Tellex, Stefanie A.; Thaker, Pratiksha R.; Joseph, Joshua Mason; Roy, Nicholas
In order for robots to effectively understand natural language commands, they must be able to acquire meaning representations that can be mapped to perceptual features in the external world. Previous approaches to learning these grounded meaning representations require detailed annotations at training time. In this paper, we present an approach to grounded language acquisition that jointly learns a policy for following natural language commands such as “Pick up the tire pallet,” as well as a mapping between specific phrases in the language and aspects of the external world; for example, the mapping between the words “the tire pallet” and a specific object in the environment. Our approach assumes a parametric form for the policy the robot uses to choose actions in response to a natural language command; the policy factors according to the structure of the language. We use a gradient method to optimize the model parameters. Our evaluation demonstrates the effectiveness of the model on a corpus of commands given to a robotic forklift by untrained users.
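To make the "gradient method" concrete: a minimal sketch of the underlying idea, assuming a generic log-linear policy over candidate groundings (actions and objects) scored by features of the command, trained by gradient ascent on the log-likelihood of observed choices. The feature design, the per-phrase factorization used in the paper, and the function names here (`log_likelihood_grad`, `train`) are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def softmax(scores):
    # Numerically stable softmax over candidate-grounding scores.
    z = scores - scores.max()
    e = np.exp(z)
    return e / e.sum()

def log_likelihood_grad(theta, features, chosen):
    # features: (num_candidates, d) matrix, one feature row per candidate
    # grounding; `chosen` is the index of the observed (correct) candidate.
    probs = softmax(features @ theta)
    # Gradient of log p(chosen) under a log-linear model:
    # observed features minus expected features under the current policy.
    return features[chosen] - probs @ features

def train(data, d, lr=0.1, epochs=100):
    # data: list of (features, chosen) pairs from unaligned parallel
    # command/demonstration examples (a simplifying assumption here).
    theta = np.zeros(d)
    for _ in range(epochs):
        for features, chosen in data:
            theta += lr * log_likelihood_grad(theta, features, chosen)
    return theta
```

In the paper's setting the score would additionally factor across phrases of the parsed command, so each phrase contributes its own feature terms to the shared parameter vector; the sketch above collapses that structure into a single feature matrix per example.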
Department: Massachusetts Institute of Technology. Computer Science and Artificial Intelligence Laboratory; Massachusetts Institute of Technology. Department of Aeronautics and Astronautics; Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science
Citation: Tellex, Stefanie, Pratiksha Thaker, Joshua Joseph, and Nicholas Roy. “Learning perceptually grounded word meanings from unaligned parallel data.” Machine Learning (May 18, 2013).
Author's final manuscript