
dc.contributor.author    Abu-Khalaf, Murad
dc.contributor.author    Karaman, Sertac
dc.contributor.author    Rus, Daniela
dc.date.accessioned      2021-11-02T18:59:32Z
dc.date.available        2021-11-02T18:59:32Z
dc.date.issued           2019-12
dc.identifier.uri        https://hdl.handle.net/1721.1/137170
dc.description.abstract    We propose controller synthesis for state regulation problems in which a human operator shares control with an autonomy system running in parallel. The autonomy system continuously improves on the human's action with minimal intervention and can take over full control if necessary. It additively combines the user's input with an adaptive optimal corrective signal to drive the plant. It is adaptive in the sense that it neither estimates nor requires a model of the human's action policy or of the plant's internal dynamics, and it can adjust to changes in both. Our contribution is twofold. First, we present a new controller synthesis for shared control, which we formulate as an adaptive optimal control problem for continuous-time linear systems and solve online via human-in-the-loop reinforcement learning; the result is an architecture that we call the shared linear quadratic regulator (sLQR). Second, we provide a new analysis of reinforcement learning for continuous-time linear systems in two parts. In the first part, we avoid learning along a single state-space trajectory, which we show leads to data collinearity under certain conditions. In doing so, we make a clear separation between exploitation of learned policies and exploration of the state space, and we propose an exploration scheme that switches to new state-space trajectories rather than injecting noise continuously while learning. Avoiding continuous noise injection minimizes interference with the human's action and avoids bias in the convergence to the stabilizing solution of the underlying algebraic Riccati equation. We show that exploring a minimum number of pairwise distinct state-space trajectories is necessary to avoid collinearity in the learning data. In the second part, we give conditions under which existence and uniqueness of solutions can be established for off-policy reinforcement learning in continuous-time linear systems; namely, prior knowledge of the input matrix.    en_US
dc.language.iso          en
dc.publisher             IEEE    en_US
dc.relation.isversionof  10.1109/cdc40024.2019.9029617    en_US
dc.rights                Creative Commons Attribution-Noncommercial-Share Alike    en_US
dc.rights.uri            http://creativecommons.org/licenses/by-nc-sa/4.0/    en_US
dc.source                arXiv    en_US
dc.title                 Shared Linear Quadratic Regulation Control: A Reinforcement Learning Approach    en_US
dc.type                  Article    en_US
dc.identifier.citation   Abu-Khalaf, Murad, Karaman, Sertac and Rus, Daniela. 2019. "Shared Linear Quadratic Regulation Control: A Reinforcement Learning Approach." Proceedings of the IEEE Conference on Decision and Control, 2019-December.
dc.contributor.department    Massachusetts Institute of Technology. Computer Science and Artificial Intelligence Laboratory
dc.contributor.department    Massachusetts Institute of Technology. Laboratory for Information and Decision Systems
dc.relation.journal      Proceedings of the IEEE Conference on Decision and Control    en_US
dc.eprint.version        Original manuscript    en_US
dc.type.uri              http://purl.org/eprint/type/ConferencePaper    en_US
eprint.status            http://purl.org/eprint/status/NonPeerReviewed    en_US
dc.date.updated          2021-04-15T17:35:15Z
dspace.orderedauthors    Abu-Khalaf, M; Karaman, S; Rus, D    en_US
dspace.date.submission   2021-04-15T17:35:16Z
mit.journal.volume       2019-December    en_US
mit.license              OPEN_ACCESS_POLICY
mit.metadata.status      Authority Work and Publication Information Needed    en_US
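
Below is a minimal, hypothetical Python sketch of the additive shared-control structure described in the abstract above: the plant is driven by the sum of the operator's input and an optimal corrective signal. The double-integrator plant, the quadratic weights, and the placeholder human policy are invented for illustration, and the corrective gain is computed offline from a known model via the algebraic Riccati equation, whereas the paper's sLQR learns this correction online, model-free, through human-in-the-loop reinforcement learning.

import numpy as np
from scipy.linalg import solve_continuous_are

# Toy double-integrator plant and quadratic cost weights (illustrative only,
# not from the paper).
A = np.array([[0.0, 1.0],
              [0.0, 0.0]])
B = np.array([[0.0],
              [1.0]])
Q = np.eye(2)
R = np.array([[1.0]])

# Corrective gain from the algebraic Riccati equation; a stand-in for the
# gain that sLQR would learn online without a plant model.
P = solve_continuous_are(A, B, Q, R)
K = np.linalg.solve(R, B.T @ P)

def human_input(x):
    # Placeholder for the operator's unknown action policy.
    return np.array([-0.5 * x[0]])

x = np.array([1.0, 0.0])   # initial state
dt = 0.01
for _ in range(int(5.0 / dt)):
    u_h = human_input(x)          # operator's input
    u_a = -(K @ x)                # autonomy's corrective signal
    u = u_h + u_a                 # additive shared control
    x = x + dt * (A @ x + B @ u)  # Euler step of the regulated plant

In this toy setting the combined input regulates the state toward the origin while only correcting, rather than overriding, the operator; the paper's contribution is achieving this without knowledge of the plant's internal dynamics or the human's policy, by learning the correction online.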

