| dc.contributor.author | Michini, Bernard J. | |
| dc.contributor.author | How, Jonathan P. | |
| dc.contributor.author | Cutler, Mark Johnson | |
| dc.date.accessioned | 2015-05-08T18:42:15Z | |
| dc.date.available | 2015-05-08T18:42:15Z | |
| dc.date.issued | 2013-05 | |
| dc.identifier.isbn | 978-1-4673-5643-5 | |
| dc.identifier.isbn | 978-1-4673-5641-1 | |
| dc.identifier.issn | 1050-4729 | |
| dc.identifier.uri | http://hdl.handle.net/1721.1/96946 | |
| dc.description.abstract | Reward learning from demonstration is the task of inferring the intents or goals of an agent demonstrating a task. Inverse reinforcement learning methods utilize the Markov decision process (MDP) framework to learn rewards, but typically scale poorly since they rely on the calculation of optimal value functions. Several key modifications are made to a previously developed Bayesian nonparametric inverse reinforcement learning algorithm that avoid calculation of an optimal value function and no longer require discretization of the state or action spaces. Experimental results demonstrate the ability of the resulting algorithm to scale to larger problems and learn in domains with continuous demonstrations. | en_US |
| dc.description.sponsorship | United States. Office of Naval Research (Autonomy Program Contract N000140910625) | en_US |
| dc.language.iso | en_US | |
| dc.publisher | Institute of Electrical and Electronics Engineers (IEEE) | en_US |
| dc.relation.isversionof | http://dx.doi.org/10.1109/ICRA.2013.6630592 | en_US |
| dc.rights | Creative Commons Attribution-Noncommercial-Share Alike | en_US |
| dc.rights.uri | http://creativecommons.org/licenses/by-nc-sa/4.0/ | en_US |
| dc.source | MIT web domain | en_US |
| dc.title | Scalable reward learning from demonstration | en_US |
| dc.type | Article | en_US |
| dc.identifier.citation | Michini, Bernard, Mark Cutler, and Jonathan P. How. “Scalable Reward Learning from Demonstration.” 2013 IEEE International Conference on Robotics and Automation (May 2013). | en_US |
| dc.contributor.department | Massachusetts Institute of Technology. Aerospace Controls Laboratory | en_US |
| dc.contributor.department | Massachusetts Institute of Technology. Department of Aeronautics and Astronautics | en_US |
| dc.contributor.mitauthor | Michini, Bernard J. | en_US |
| dc.contributor.mitauthor | Cutler, Mark Johnson | en_US |
| dc.contributor.mitauthor | How, Jonathan P. | en_US |
| dc.relation.journal | Proceedings of the 2013 IEEE International Conference on Robotics and Automation | en_US |
| dc.eprint.version | Author's final manuscript | en_US |
| dc.type.uri | http://purl.org/eprint/type/JournalArticle | en_US |
| eprint.status | http://purl.org/eprint/status/PeerReviewed | en_US |
| dspace.orderedauthors | Michini, Bernard; Cutler, Mark; How, Jonathan P. | en_US |
| dc.identifier.orcid | https://orcid.org/0000-0001-8576-1930 | |
| dc.identifier.orcid | https://orcid.org/0000-0003-0776-7901 | |
| mit.license | OPEN_ACCESS_POLICY | en_US |
| mit.metadata.status | Complete | |