Mutual information-based gradient-ascent control for distributed robotics

Julian, Brian John

Other Contributors

Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science.

Advisor

Daniela L. Rus.

Terms of use

M.I.T. theses are protected by copyright. They may be viewed from this source for any purpose, but reproduction or distribution in any format is prohibited without written permission. See provided URL for inquiries about permission. http://dspace.mit.edu/handle/1721.1/7582

Abstract

This thesis presents the derivation, analysis, and implementation of a novel class of decentralized mutual information-based gradient-ascent controllers that continuously move robots equipped with sensors to better observe their environment. We begin with the fundamental problem of deploying a single ground robot equipped with a range sensor and tasked to build an occupancy grid map. The desired explorative behaviors of the robot for occupancy grid mapping highlight the correlation between the information content and the spatial realization of the robot's range measurements. We prove that any occupancy grid controller tasked to maximize a mutual information reward function is eventually attracted to unexplored space, i.e., areas of highest uncertainty. We show that mutual information encodes geometric relationships that are fundamental to robot control and yields geometrically relevant reward surfaces on which robots can navigate. Taking inspiration from geometric-based approaches to distributed robot coordination, we show that many multi-robot inference tasks can be cast in terms of an optimization problem. This optimization problem defines the task of minimizing the conditional entropy associated with the robots' inferred beliefs of the environment, which is equivalent to maximizing the mutual information between the environment state and the robots' next joint observation. Given simple robot dynamics and few probabilistic assumptions, none of which involve Gaussianity, we derive a gradient-ascent solution approach to these optimization problems that is convergent between sensor observations and locally optimal. More formally, we invoke LaSalle's Invariance Principle to prove that, given enough time between consecutive joint observations, robots following the gradient of mutual information will converge to goal positions that locally maximize the expected information gain resulting from the next observation.
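
The equivalence between minimizing conditional entropy and maximizing mutual information, stated in the paragraph above, follows from a standard identity. The sketch below uses assumed notation that is not taken from the thesis: X for the environment state, Z for the robots' next joint observation, c_i for the position of robot i, and k for a positive control gain. The single-integrator form of the dynamics is likewise an assumption made only for illustration.

```latex
% Assumed notation: X = environment state, Z = next joint observation.
I(X;Z) = H(X) - H(X \mid Z)

% The prior entropy H(X) does not depend on the robot positions c, so
\arg\max_{c} \, I(X;Z) = \arg\min_{c} \, H(X \mid Z)

% Gradient ascent on mutual information (assumed single-integrator dynamics):
\dot{c}_i = k \, \frac{\partial I(X;Z)}{\partial c_i}, \qquad k > 0
```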
We show that the algorithmic implementation of the generalized gradient-ascent controller is not readily distributed among multiple robots, and thus sample-based methods are introduced to distributively approximate the likelihoods of the robots' joint observations. Not only are the involved non-parametric representations compatible with any type of Bayesian filter, but the computational complexities of the resulting decentralized controllers are independent of the number of robots. Concerning the distributed approximations, we give two example consensus-based algorithms that run on an undirected network graph. The first consensus-based algorithm approximates discrete measurement probabilities, while the second approximates continuous likelihood distributions. We show that these anytime approximations provably converge to the correct values on a static and connected network graph without knowledge of the number of robots in the network or the corresponding graph's topology. Lastly, we incorporate the resulting consensus-based algorithms into both a hardware system and a simulation environment to allow for decentralized controller evaluation under non-ideal network settings. For the hardware experiments, the task is to infer the state of a bounded, planar environment by deploying five quadrotor flying robots with simulated sensors in both indoor and outdoor settings. For the numerical simulations, Monte Carlo-based analyses are performed for 100 robots, where each robot is simulated on an independent computer node within a computer cluster system. Simulations are also performed for 1000 robots using a single workstation computer equipped with a multicore GPU-enabled graphics card. The results from both the hardware experiments and numerical simulations validate our theoretical and computational claims throughout the thesis.
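
To give a flavor of consensus on a static, connected, undirected network graph, the following is a minimal sketch of generic linear average consensus with Metropolis weights. It is not the thesis's algorithm; the function name, the toy graph, and the initial values are hypothetical. Each node uses only its own degree and its neighbors' degrees, so no node needs the total number of robots or the full topology, and every intermediate estimate is usable in an anytime fashion.

```python
def metropolis_consensus(values, neighbors, iterations):
    """Generic average consensus (illustrative, not the thesis's algorithm).

    values:     initial scalar estimate held by each node
    neighbors:  neighbors[i] lists the nodes adjacent to node i (undirected)
    iterations: number of synchronous communication rounds
    """
    x = list(values)
    degree = [len(neighbors[i]) for i in range(len(x))]
    for _ in range(iterations):
        new_x = []
        for i in range(len(x)):
            update = x[i]
            for j in neighbors[i]:
                # Metropolis weight: depends only on the two local degrees.
                w = 1.0 / (1 + max(degree[i], degree[j]))
                update += w * (x[j] - x[i])
            new_x.append(update)
        x = new_x  # synchronous update across all nodes
    return x


# Hypothetical example: five robots connected in a line, each starting
# with its own local estimate of some probability.
line_graph = [[1], [0, 2], [1, 3], [2, 4], [3]]
initial = [0.1, 0.9, 0.5, 0.3, 0.7]
estimates = metropolis_consensus(initial, line_graph, 200)
```

Because the weights are symmetric, the sum of the estimates is preserved at every round, so on a connected graph each node's estimate approaches the network-wide average of the initial values.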

Description

Thesis (Ph. D.)--Massachusetts Institute of Technology, Department of Electrical Engineering and Computer Science, 2013.

Cataloged from PDF version of thesis.

Includes bibliographical references (pages 167-179).

Date issued

2013

URI

http://hdl.handle.net/1721.1/84889

Department

Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science

Publisher

Massachusetts Institute of Technology

Keywords

Electrical Engineering and Computer Science.

Collections

Doctoral Theses