
dc.contributor.advisor: Rus, Daniela L.
dc.contributor.advisor: Karaman, Sertac
dc.contributor.author: Sander, Ryan M.
dc.date.accessioned: 2022-01-14T14:42:02Z
dc.date.available: 2022-01-14T14:42:02Z
dc.date.issued: 2021-06
dc.date.submitted: 2021-06-17T20:14:14.251Z
dc.identifier.uri: https://hdl.handle.net/1721.1/138972
dc.description.abstract: The human brain is remarkably sample efficient, capable of learning complex behaviors from limited experience [16]. This sample efficiency is crucial for training robust deep reinforcement learning agents on continuous control tasks: when experience is limited, poor sample efficiency can yield sub-optimal and unstable policies. To improve sample efficiency in these tasks, we propose Neighborhood Mixup Experience Replay (NMER) and Bayesian Interpolated Experience Replay (BIER), modular replay buffers that interpolate transitions with their closest neighbors in normalized state-action space. NMER preserves a locally linear approximation of the transition manifold by interpolating only transitions with similar state-action features. BIER expands upon NMER by predicting the interpolated transitions queried by NMER using learned Gaussian Process Regression models defined over a transition’s neighborhood. These interpolated transitions, predicted via Bayesian linear smoothing, are then used to update the policy and value functions of deep reinforcement learning agents in a likelihood-weighted fashion. NMER and BIER achieve greater sample efficiency than other state-of-the-art replay buffers when evaluated with model-free, off-policy reinforcement learning algorithms on OpenAI Gym MuJoCo environments. This improved sample efficiency can enable agents to learn robust and generalizable policies on continuous control tasks in settings where data is limited, such as many real-world robotics tasks.
dc.publisher: Massachusetts Institute of Technology
dc.rights: In Copyright - Educational Use Permitted
dc.rights: Copyright MIT
dc.rights.uri: http://rightsstatements.org/page/InC-EDU/1.0/
dc.title: Interpolated Experience Replay for Improved Sample Efficiency of Model-Free Deep Reinforcement Learning Algorithms
dc.type: Thesis
dc.description.degree: M.Eng.
dc.contributor.department: Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science
mit.thesis.degree: Master
thesis.degree.name: Master of Engineering in Electrical Engineering and Computer Science
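
The abstract describes NMER's sampling rule concretely: pair each sampled transition with its nearest neighbor in normalized state-action space, then form a convex (mixup) combination of the two. Below is a minimal Python sketch of that rule under stated assumptions; the NMERBuffer class name, the Beta(alpha, alpha) mixup coefficient, the z-score normalization, and the brute-force neighbor search are all illustrative choices, not the thesis implementation.

```python
import numpy as np

class NMERBuffer:
    """Replay buffer with Neighborhood Mixup sampling (illustrative sketch)."""

    def __init__(self, capacity, state_dim, action_dim, alpha=1.0):
        self.capacity, self.size, self.idx = capacity, 0, 0
        self.alpha = alpha  # Beta(alpha, alpha) mixup coefficient (assumed)
        self.states = np.zeros((capacity, state_dim))
        self.actions = np.zeros((capacity, action_dim))
        self.rewards = np.zeros((capacity, 1))
        self.next_states = np.zeros((capacity, state_dim))
        self.dones = np.zeros((capacity, 1))

    def add(self, s, a, r, s2, d):
        # Standard ring-buffer insertion.
        i = self.idx
        self.states[i], self.actions[i] = s, a
        self.rewards[i], self.next_states[i], self.dones[i] = r, s2, d
        self.idx = (self.idx + 1) % self.capacity
        self.size = min(self.size + 1, self.capacity)

    def _normalized_features(self):
        # Z-score normalize state-action features so neighbor distances
        # are not dominated by any one dimension.
        feats = np.hstack([self.states[:self.size], self.actions[:self.size]])
        return (feats - feats.mean(0)) / (feats.std(0) + 1e-8)

    def sample(self, batch_size):
        assert self.size >= 2, "need at least two transitions to interpolate"
        feats = self._normalized_features()
        anchors = np.random.randint(0, self.size, batch_size)

        # Nearest neighbor (excluding self) in normalized state-action space.
        # Brute force keeps the sketch short; a KD-tree scales better.
        dists = np.linalg.norm(feats[None, :, :] - feats[anchors, None, :],
                               axis=-1)
        dists[np.arange(batch_size), anchors] = np.inf
        neighbors = dists.argmin(axis=1)

        # Mixup: convex combination of each transition with its neighbor.
        lam = np.random.beta(self.alpha, self.alpha, size=(batch_size, 1))

        def mix(x):
            return lam * x[anchors] + (1.0 - lam) * x[neighbors]

        # Note: interpolating terminal flags yields fractional values; a real
        # implementation might instead copy the flag from the nearer transition.
        return (mix(self.states), mix(self.actions), mix(self.rewards),
                mix(self.next_states), mix(self.dones))
```

With an off-policy learner such as SAC or TD3, sample(batch_size) would simply replace the usual uniform minibatch draw. Per the abstract, BIER would replace the convex combination step with Gaussian Process Regression models fit over each transition's neighborhood and would weight the resulting policy and value updates by predictive likelihood.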

