Social inductive biases for reinforcement learning

Adjodah, Dhaval D. K.(Adjodlah, Dhaval Dhamnidhi Kumar)

Author(s)

Adjodah, Dhaval D. K.(Adjodlah, Dhaval Dhamnidhi Kumar)

Download1203140813-MIT.pdf (12.97Mb)

Other Contributors

Program in Media Arts and Sciences (Massachusetts Institute of Technology)

Advisor

Alex "Sandy" P. Pentland.

Terms of use

MIT theses may be protected by copyright. Please reuse MIT thesis content according to the MIT Libraries Permissions Policy, which is available through the URL provided. http://dspace.mit.edu/handle/1721.1/7582

Metadata

Show full item record

Abstract

How can we build machines that collaborate and learn more seamlessly with humans, and with each other? How do we create fairer societies? How do we minimize the impact of information manipulation campaigns, and fight back? How do we build machine learning algorithms that are more sample efficient when learning from each other's sparse data, and under time constraints? At the root of these questions is a simple one: how do agents, human or machines, learn from each other, and can we improve it and apply it to new domains? The cognitive and social sciences have provided innumerable insights into how people learn from data using both passive observation and experimental intervention. Similarly, the statistics and machine learning communities have formalized learning as a rigorous and testable computational process.

There is a growing movement to apply insights from the cognitive and social sciences to improving machine learning, as well as opportunities to use machine learning as a sandbox to test, simulate and expand ideas from the cognitive and social sciences. A less researched and fertile part of this intersection is the modeling of social learning: past work has been more focused on how agents can learn from the 'environment', and there is less work that borrows from both communities to look into how agents learn from each other. This thesis presents novel contributions into the nature and usefulness of social learning as an inductive bias for reinforced learning.

I start by presenting the results from two large-scale online human experiments: first, I observe Dunbar cognitive limits that shape and limit social learning in two different social trading platforms, with the additional contribution that synthetic financial bots that transcend human limitations can obtain higher profits even when using naive trading strategies. Second, I devise a novel online experiment to observe how people, at the individual level, update their belief of future financial asset prices (e.g. S&P 500 and Oil prices) from social information. I model such social learning using Bayesian models of cognition, and observe that people make strong distributional assumptions on the social data they observe (e.g. assuming that the likelihood data is unimodal).

I were fortunate to collect one round of predictions during the Brexit market instability, and find that social learning leads to higher performance than when learning from the underlying price history (the environment) during such volatile times. Having observed the cognitive limits and biases people exhibit when learning from other agents, I present an motivational example of the strength of inductive biases in reinforcement learning: I implement a learning model with a relational inductive bias that pre-processes the environment state into a set of relationships between entities in the world. I observe strong improvements in performance and sample efficiency, and even observe the learned relationships to be strongly interpretable.

Finally, given that most modern deep reinforcement learning algorithms are distributed (in that they have separate learning agents), I investigate the hypothesis that viewing deep reinforcement learning as a social learning distributed search problem could lead to strong improvements. I do so by creating a fully decentralized, sparsely-communicating and scalable learning algorithm, and observe strong learning improvements with lower communication bandwidth usage (between learning agents) when using communication topologies that naturally evolved due to social learning in humans. Additionally, I provide a theoretical upper bound (that agrees with our empirical results) regarding which communication topologies lead to the largest learning performance improvement.

Given a future increasingly filled with decentralized autonomous machine learning systems that interact with humans, there is an increasing need to understand social learning to build resilient, scalable and effective learning systems, and this thesis provides insights into how to build such systems.

Description

Thesis: Ph. D., Massachusetts Institute of Technology, School of Architecture and Planning, Program in Media Arts and Sciences, September, 2019

Cataloged from the official PDF of thesis. "The Table of Contents does not accurately represent the page numbering"--Disclaimer page.

Includes bibliographical references (pages 117-126).

Date issued

2019

URI

https://hdl.handle.net/1721.1/128415

Department

Program in Media Arts and Sciences (Massachusetts Institute of Technology)

Publisher

Massachusetts Institute of Technology

Keywords

Program in Media Arts and Sciences

Collections

Doctoral Theses