dc.contributor.author | Michini, Bernard J.
dc.contributor.author | How, Jonathan P.
dc.contributor.author | Cutler, Mark Johnson
dc.date.accessioned | 2015-05-08T18:42:15Z
dc.date.available | 2015-05-08T18:42:15Z
dc.date.issued | 2013-05
dc.identifier.isbn | 978-1-4673-5643-5
dc.identifier.isbn | 978-1-4673-5641-1
dc.identifier.issn | 1050-4729
dc.identifier.uri | http://hdl.handle.net/1721.1/96946
dc.description.abstract | Reward learning from demonstration is the task of inferring the intents or goals of an agent demonstrating a task. Inverse reinforcement learning methods utilize the Markov decision process (MDP) framework to learn rewards, but typically scale poorly since they rely on the calculation of optimal value functions. Several key modifications are made to a previously developed Bayesian nonparametric inverse reinforcement learning algorithm that avoid calculation of an optimal value function and no longer require discretization of the state or action spaces. Experimental results given demonstrate the ability of the resulting algorithm to scale to larger problems and learn in domains with continuous demonstrations. | en_US
dc.description.sponsorship | United States. Office of Naval Research (Autonomy Program Contract N000140910625) | en_US
dc.language.iso | en_US
dc.publisher | Institute of Electrical and Electronics Engineers (IEEE) | en_US
dc.relation.isversionof | http://dx.doi.org/10.1109/ICRA.2013.6630592 | en_US
dc.rights | Creative Commons Attribution-Noncommercial-Share Alike | en_US
dc.rights.uri | http://creativecommons.org/licenses/by-nc-sa/4.0/ | en_US
dc.source | MIT web domain | en_US
dc.title | Scalable reward learning from demonstration | en_US
dc.type | Article | en_US
dc.identifier.citation | Michini, Bernard, Mark Cutler, and Jonathan P. How. “Scalable Reward Learning from Demonstration.” 2013 IEEE International Conference on Robotics and Automation (May 2013). | en_US
dc.contributor.department | Massachusetts Institute of Technology. Aerospace Controls Laboratory | en_US
dc.contributor.department | Massachusetts Institute of Technology. Department of Aeronautics and Astronautics | en_US
dc.contributor.mitauthor | Michini, Bernard J. | en_US
dc.contributor.mitauthor | Cutler, Mark Johnson | en_US
dc.contributor.mitauthor | How, Jonathan P. | en_US
dc.relation.journal | Proceedings of the 2013 IEEE International Conference on Robotics and Automation | en_US
dc.eprint.version | Author's final manuscript | en_US
dc.type.uri | http://purl.org/eprint/type/JournalArticle | en_US
eprint.status | http://purl.org/eprint/status/PeerReviewed | en_US
dspace.orderedauthors | Michini, Bernard; Cutler, Mark; How, Jonathan P. | en_US
dc.identifier.orcid | https://orcid.org/0000-0001-8576-1930
dc.identifier.orcid | https://orcid.org/0000-0003-0776-7901
mit.license | OPEN_ACCESS_POLICY | en_US
mit.metadata.status | Complete

