Toward understanding natural language directions

Kollar, Thomas; Tellex, Stefanie; Roy, Deb; Roy, Nicholas

dc.contributor.author	Kollar, Thomas Fleming
dc.contributor.author	Tellex, Stefanie A.
dc.contributor.author	Roy, Deb K.
dc.contributor.author	Roy, Nicholas
dc.date.accessioned	2011-11-15T13:56:05Z
dc.date.available	2011-11-15T13:56:05Z
dc.date.issued	2010-03
dc.identifier.isbn	978-1-4244-4893-7
dc.identifier.uri	http://hdl.handle.net/1721.1/67029
dc.description.abstract	Speaking using unconstrained natural language is an intuitive and flexible way for humans to interact with robots. Understanding this kind of linguistic input is challenging because diverse words and phrases must be mapped into structures that the robot can understand, and elements in those structures must be grounded in an uncertain environment. We present a system that follows natural language directions by extracting a sequence of spatial description clauses from the linguistic input and then inferring the most probable path through the environment given only information about the environmental geometry and detected visible objects. We use a probabilistic graphical model that factors into three key components. The first component grounds landmark phrases such as "the computers" in the perceptual frame of the robot by exploiting co-occurrence statistics from a database of tagged images such as Flickr. Second, a spatial reasoning component judges how well spatial relations such as "past the computers" describe a path. Finally, verb phrases such as "turn right" are modeled according to the amount of change in orientation in the path. Our system follows 60% of the directions in our corpus to within 15 meters of the true destination, significantly outperforming other approaches.	en_US
dc.description.sponsorship	United States. Office of Naval Research (MURI N00014-07-1-0749)	en_US
dc.language.iso	en_US
dc.publisher	Association for Computing Machinery	en_US
dc.relation.isversionof	http://dx.doi.org/10.1145/1734454.1734553	en_US
dc.rights	Creative Commons Attribution-Noncommercial-Share Alike 3.0	en_US
dc.rights.uri	http://creativecommons.org/licenses/by-nc-sa/3.0/	en_US
dc.source	MIT web domain	en_US
dc.title	Toward understanding natural language directions	en_US
dc.type	Article	en_US
dc.identifier.citation	Kollar, Thomas et al. “Toward Understanding Natural Language Directions.” Proceeding of the 5th ACM/IEEE International Conference on Human-robot Interaction - HRI ’10. Osaka, Japan, 2010. 259.	en_US
dc.contributor.department	Massachusetts Institute of Technology. Computer Science and Artificial Intelligence Laboratory	en_US
dc.contributor.department	Massachusetts Institute of Technology. Media Laboratory	en_US
dc.contributor.approver	Roy, Deb K.
dc.contributor.mitauthor	Kollar, Thomas Fleming
dc.contributor.mitauthor	Tellex, Stefanie A.
dc.contributor.mitauthor	Roy, Deb K.
dc.contributor.mitauthor	Roy, Nicholas
dc.relation.journal	Proceeding of the 5th ACM/IEEE international conference on Human-robot interaction	en_US
dc.eprint.version	Author's final manuscript	en_US
dc.type.uri	http://purl.org/eprint/type/ConferencePaper	en_US
dspace.orderedauthors	Kollar, Thomas; Tellex, Stefanie; Roy, Deb; Roy, Nicholas	en
dc.identifier.orcid	https://orcid.org/0000-0002-4333-7194
dc.identifier.orcid	https://orcid.org/0000-0002-8293-0492
mit.license	OPEN_ACCESS_POLICY	en_US
mit.metadata.status	Complete

Files in this item

Name: Roy_Toward Understanding.pdf
Size: 1.182Mb
Format: PDF

This item appears in the following Collection(s)

MIT Open Access Articles
