Show simple item record

dc.contributor.author: Kaelbling, Leslie P.
dc.contributor.author: Lozano-Pérez, Tomás
dc.contributor.author: Kim, Beomjoon
dc.date.accessioned: 2021-11-08T16:46:03Z
dc.date.available: 2021-11-08T16:46:03Z
dc.date.issued: 2018
dc.identifier.uri: https://hdl.handle.net/1721.1/137707
dc.description.abstract: Copyright © 2018, Association for the Advancement of Artificial Intelligence (www.aaai.org). All rights reserved. In robotics, it is essential to be able to plan efficiently in high-dimensional continuous state-action spaces for long horizons. For such complex planning problems, unguided uniform sampling of actions until a path to a goal is found is hopelessly inefficient, and gradient-based approaches often fall short when the optimization manifold of a given problem is not smooth. In this paper, we present an approach that guides search in continuous spaces for generic planners by learning an action sampler from past search experience. We use a Generative Adversarial Network (GAN) to represent an action sampler, and address an important issue: search experience consists of a relatively large number of actions that are not on a solution path and a relatively small number of actions that actually are on a solution path. We introduce a new technique, based on an importance-ratio estimation method, for using samples from a non-target distribution to make GAN learning more data-efficient. We provide theoretical guarantees and empirical evaluation in three challenging continuous robot planning problems to illustrate the effectiveness of our algorithm. (en_US)
dc.language.iso: en
dc.relation.isversionof: https://openreview.net/forum?id=bh-hFWQ7zJyJ (en_US)
dc.rights: Creative Commons Attribution-Noncommercial-Share Alike (en_US)
dc.rights.uri: http://creativecommons.org/licenses/by-nc-sa/4.0/ (en_US)
dc.source: MIT web domain (en_US)
dc.title: Guiding search in continuous state-action spaces by learning an action sampler from off-target search experience (en_US)
dc.type: Article (en_US)
dc.identifier.citation: Kaelbling, Leslie P., Lozano-Pérez, Tomás and Kim, Beomjoon. 2018. "Guiding search in continuous state-action spaces by learning an action sampler from off-target search experience."
dc.contributor.department: Massachusetts Institute of Technology. Computer Science and Artificial Intelligence Laboratory
dc.eprint.version: Author's final manuscript (en_US)
dc.type.uri: http://purl.org/eprint/type/ConferencePaper (en_US)
eprint.status: http://purl.org/eprint/status/NonPeerReviewed (en_US)
dc.date.updated: 2019-06-04T15:28:25Z
dspace.date.submission: 2019-06-04T15:28:26Z
mit.license: OPEN_ACCESS_POLICY
mit.metadata.status: Authority Work and Publication Information Needed (en_US)
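As context for the abstract above: its central device, reweighting the plentiful off-solution-path search samples so they can inform a generative model of the scarce on-path actions, rests on importance-ratio (density-ratio) estimation. The sketch below is only an illustrative assumption, not the paper's implementation; the synthetic action sets, the variable names, and the use of scikit-learn's LogisticRegression are placeholders showing how such ratios might be estimated with a classifier and then used as sample weights during GAN training.

# Illustrative sketch (assumed, not the paper's code): classifier-based
# importance-ratio estimation between off-path and on-path search samples.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)

# Synthetic stand-ins: many actions explored during search that were NOT on a
# solution path, and a few actions that WERE on a solution path.
off_path_actions = rng.normal(loc=0.0, scale=1.0, size=(2000, 2))
on_path_actions = rng.normal(loc=1.0, scale=0.5, size=(100, 2))

# Train a probabilistic classifier to distinguish the two sets.
X = np.vstack([off_path_actions, on_path_actions])
y = np.concatenate([np.zeros(len(off_path_actions)), np.ones(len(on_path_actions))])
clf = LogisticRegression().fit(X, y)

def importance_ratio(actions):
    # Density-ratio trick: p_target(a)/p_source(a) ≈ (n_source/n_target) * D(a)/(1 - D(a)),
    # where D(a) is the classifier's probability that action a is from the on-path set.
    d = clf.predict_proba(actions)[:, 1]
    prior = len(off_path_actions) / len(on_path_actions)
    return prior * d / np.clip(1.0 - d, 1e-6, None)

weights = importance_ratio(off_path_actions)
print("mean importance ratio over off-path actions:", weights.mean())

In a GAN training loop, weights of this kind would multiply the loss contribution of off-path samples so that the abundant non-target data can still shape the learned action sampler; here they are simply computed and printed.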

