dc.contributor.author: Chowdhary, Girish
dc.contributor.author: Liu, Miao
dc.contributor.author: Grande, Robert
dc.contributor.author: Walsh, Thomas
dc.contributor.author: How, Jonathan P.
dc.contributor.author: Carin, Lawrence
dc.date.accessioned: 2015-05-11T19:13:37Z
dc.date.available: 2015-05-11T19:13:37Z
dc.date.issued: 2014-07
dc.date.submitted: 2014-05
dc.identifier.issn: 2329-9266
dc.identifier.uri: http://hdl.handle.net/1721.1/96958
dc.description.abstract: An off-policy Bayesian nonparametric approximate reinforcement learning framework, termed GPQ, that employs a Gaussian process (GP) model of the value (Q) function is presented in both the batch and online settings. Sufficient conditions on GP hyperparameter selection are established to guarantee convergence of off-policy GPQ in the batch setting, and theoretical and practical extensions are provided for the online case. Empirical results demonstrate that GPQ has competitive learning speed in addition to its convergence guarantees and its ability to automatically choose its own basis locations. [en_US]
dc.description.sponsorship: United States. Office of Naval Research (Autonomy Program N000140910625) [en_US]
dc.language.iso: en_US
dc.publisher: Institute of Electrical and Electronics Engineers (IEEE) [en_US]
dc.relation.isversionof: http://dx.doi.org/10.1109/JAS.2014.7004680 [en_US]
dc.rights: Creative Commons Attribution-Noncommercial-Share Alike [en_US]
dc.rights.uri: http://creativecommons.org/licenses/by-nc-sa/4.0/ [en_US]
dc.source: Other univ. web domain [en_US]
dc.title: Off-policy reinforcement learning with Gaussian processes [en_US]
dc.type: Article [en_US]
dc.identifier.citation: Chowdhary, Girish, Miao Liu, Robert Grande, Thomas Walsh, Jonathan How, and Lawrence Carin. "Off-policy reinforcement learning with Gaussian processes." IEEE/CAA Journal of Automatica Sinica, Vol. 1, No. 3, July 2014. [en_US]
dc.contributor.department: Massachusetts Institute of Technology. Aerospace Controls Laboratory [en_US]
dc.contributor.department: Massachusetts Institute of Technology. Department of Aeronautics and Astronautics [en_US]
dc.contributor.mitauthor: Grande, Robert [en_US]
dc.contributor.mitauthor: Walsh, Thomas [en_US]
dc.contributor.mitauthor: How, Jonathan P. [en_US]
dc.relation.journal: IEEE/CAA Journal of Automatica Sinica [en_US]
dc.eprint.version: Author's final manuscript [en_US]
dc.type.uri: http://purl.org/eprint/type/JournalArticle [en_US]
eprint.status: http://purl.org/eprint/status/PeerReviewed [en_US]
dspace.orderedauthors: Chowdhary, Girish; Liu, Miao; Grande, Robert; Walsh, Thomas; How, Jonathan; Carin, Lawrence [en_US]
dc.identifier.orcid: https://orcid.org/0000-0001-8576-1930
mit.license: OPEN_ACCESS_POLICY [en_US]
mit.metadata.status: Complete

