Show simple item record

dc.contributor.author	Walter, Matthew R.
dc.contributor.author	Friedman, Yuli
dc.contributor.author	Antone, Matthew
dc.contributor.author	Teller, Seth
dc.date.accessioned	2012-10-02T14:57:19Z
dc.date.available	2012-10-02T14:57:19Z
dc.date.issued	2012-04
dc.identifier.issn	0278-3649
dc.identifier.issn	1741-3176
dc.identifier.uri	http://hdl.handle.net/1721.1/73543
dc.description.abstract	We describe a vision-based algorithm that enables a robot to robustly detect specific objects in a scene following an initial segmentation hint from a human user. The novelty lies in the ability to ‘reacquire’ objects over extended spatial and temporal excursions within challenging environments based upon a single training example. The primary difficulty lies in achieving an effective reacquisition capability that is robust to the effects of local clutter, lighting variation, and object relocation. We overcome these challenges through an adaptive detection algorithm that automatically generates multiple-view appearance models for each object online. As the robot navigates within the environment and the object is detected from different viewpoints, the one-shot learner opportunistically and automatically incorporates additional observations into each model. In order to overcome the effects of ‘drift’ common to adaptive learners, the algorithm imposes simple requirements on the geometric consistency of candidate observations. Motivating our reacquisition strategy is our work developing a mobile manipulator that interprets and autonomously performs commands conveyed by a human user. The ability to detect specific objects and reconstitute the user’s segmentation hints enables the robot to be situationally aware. This situational awareness enables rich command and control mechanisms and affords natural interaction. We demonstrate one such capability that allows the human to give the robot a ‘guided tour’ of named objects within an outdoor environment and, hours later, to direct the robot to manipulate those objects by name using spoken instructions. We implemented our appearance-based detection strategy on our robotic manipulator as it operated over multiple days in different outdoor environments. We evaluate the algorithm’s performance under challenging conditions that include scene clutter, lighting and viewpoint variation, object ambiguity, and object relocation. The results demonstrate a reacquisition capability that is effective in real-world settings.	en_US
dc.description.sponsorship	United States. Air Force (Contract FA8721-05-C-0002)	en_US
dc.language.iso	en_US
dc.publisher	Sage Publications	en_US
dc.relation.isversionof	http://dx.doi.org/10.1177/0278364911435515	en_US
dc.rights	Creative Commons Attribution-Noncommercial-Share Alike 3.0	en_US
dc.rights.uri	http://creativecommons.org/licenses/by-nc-sa/3.0/	en_US
dc.source	MIT web domain	en_US
dc.title	One-shot visual appearance learning for mobile manipulation	en_US
dc.type	Article	en_US
dc.identifier.citation	Walter, M. R. et al. “One-shot Visual Appearance Learning for Mobile Manipulation.” The International Journal of Robotics Research 31.4 (2012): 554–567.	en_US
dc.contributor.department	Massachusetts Institute of Technology. Computer Science and Artificial Intelligence Laboratory	en_US
dc.contributor.department	Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science	en_US
dc.contributor.mitauthor	Walter, Matthew R.
dc.contributor.mitauthor	Teller, Seth
dc.relation.journal	International Journal of Robotics Research	en_US
dc.eprint.version	Author's final manuscript	en_US
dc.type.uri	http://purl.org/eprint/type/JournalArticle	en_US
eprint.status	http://purl.org/eprint/status/PeerReviewed	en_US
dspace.orderedauthors	Walter, M. R.; Friedman, Y.; Antone, M.; Teller, S.	en
mit.license	OPEN_ACCESS_POLICY	en_US
mit.metadata.status	Complete

