
Information theoretic bounds for distributed computation

Author(s)
Ayaso, Ola.
Download: Full printable version (7.716 MB)
Other Contributors
Massachusetts Institute of Technology. Dept. of Electrical Engineering and Computer Science.
Advisors
Munther A. Dahleh and Devavrat Shah.
Terms of use
M.I.T. theses are protected by copyright. They may be viewed from this source for any purpose, but reproduction or distribution in any format is prohibited without written permission. See provided URL for inquiries about permission. http://dspace.mit.edu/handle/1721.1/7582
Abstract
In this thesis, I explore via two formulations the impact of communication constraints on distributed computation. In both formulations, nodes make partial observations of an underlying source. They communicate in order to compute a given function of all the measurements in the network, to within a desired level of error. Such computation in networks arises in various contexts, like wireless and sensor networks, consensus and belief propagation with bit constraints, and estimation of a slowly evolving process. By utilizing information-theoretic formulations and tools, I obtain code- or algorithm-independent lower bounds that capture fundamental limits imposed by the communication network. In the first formulation, each node samples a component of a source whose values belong to a field of order q. The nodes utilize their knowledge of the joint probability mass function of the components, together with the function to be computed, to efficiently compress their messages, which are then broadcast. The question is: how many bits per sample are necessary and sufficient for each node to broadcast in order for the probability of decoding error to approach zero as the number of samples grows? I find that when there are two nodes in the network seeking to compute the sample-wise modulo-q sum of their measurements, a node compressing so that the other can compute the modulo-q sum is no more efficient than compressing so that the actual data sequence can be decoded. However, when there are more than two nodes, I demonstrate that there exists a joint probability mass function for which nodes can compress more efficiently so that the modulo-q sum is decoded with probability of error asymptotically approaching zero: it is both necessary and sufficient for nodes to send fewer bits per sample than they would need for all nodes to acquire all the data sequences in the network.
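As a rough numerical illustration of this first formulation (a sketch of my own, not code or data from the thesis), the Python snippet below draws correlated sequences over a field of order q = 3 for three nodes from an assumed joint probability mass function, forms the sample-wise modulo-q sum, and compares the empirical entropy of the sum with that of the individual sequences. Every parameter, distribution, and name here is hypothetical and chosen only for illustration.

# Illustrative sketch only: sample-wise modulo-q sums and an entropy comparison.
# The correlated source below is an arbitrary assumption, not one from the thesis.
import numpy as np

q = 3                      # field order
m = 100_000                # samples per node
rng = np.random.default_rng(0)

# Hypothetical correlated source: X1 uniform; X2 and X3 are noisy copies of X1 (mod q).
x1 = rng.integers(0, q, m)
noise2 = rng.choice(q, m, p=[0.8, 0.1, 0.1])
noise3 = rng.choice(q, m, p=[0.8, 0.1, 0.1])
x2 = (x1 + noise2) % q
x3 = (x1 + noise3) % q

# The function of interest: the sample-wise modulo-q sum of all measurements.
s = (x1 + x2 + x3) % q

def empirical_entropy(samples, q):
    """Empirical entropy in bits of a sequence over {0, ..., q-1}."""
    counts = np.bincount(samples, minlength=q)
    p = counts / counts.sum()
    p = p[p > 0]
    return float(-(p * np.log2(p)).sum())

# Per-sample bits when each full sequence must be recoverable vs. only the sum.
print("H(X1), H(X2), H(X3):",
      [round(empirical_entropy(x, q), 3) for x in (x1, x2, x3)])
print("H(X1+X2+X3 mod q):  ", round(empirical_entropy(s, q), 3))

Under this assumed source, the modulo-q sum has noticeably lower empirical entropy than any single sequence, which gives an informal sense of why encoding for the sum alone can require fewer bits per sample than conveying the full data.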
 
In the second formulation, each node has an initial real-valued measurement. Nodes communicate their values via a network with a fixed topology and noisy channels between linked nodes. The goal is for each node to estimate a given function of all the initial values in the network, so that the mean square error in the estimate is within a prescribed interval. Here, the nodes do not know the distribution of the source, but have unlimited computation power to run whatever algorithm is needed to meet the mean square error criterion. The question is: how does the communication network impact the time until the performance criterion is guaranteed? Using information-theoretic inequalities, I derive an algorithm-independent lower bound on the computation time. The bound is a function of the uncertainty in the function to be estimated, via its differential entropy, and of the desired accuracy level, as specified by the mean square error criterion. Next, I demonstrate the use of this bound in a scenario where nodes communicate through erasure channels to learn a linear function of all the nodes' initial values. For this scenario, I describe an algorithm whose running time, until with high probability all nodes' estimates lie within a prescribed interval of the true value, is reciprocally related to the "conductance." Conductance quantifies the information-flow "bottleneck" in the network and hence captures the effect of the topology and capacities. Using the lower bound, I show that the running time of any algorithm that guarantees the aforementioned probability criterion must scale reciprocally with conductance. Thus, the lower bound is tight in capturing the effect of network topology via conductance; conversely, the running time of the algorithm is optimal with respect to its dependence on conductance.
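To give a concrete, purely illustrative feel for the role of conductance, the sketch below runs a standard randomized-gossip averaging scheme, which is not the algorithm analyzed in the thesis, on an assumed ten-node topology made of two cliques joined by a single bridge edge, and counts the pairwise exchanges needed before every node's estimate of the average falls within a prescribed interval of the true value. The topology, accuracy level, and seed are all hypothetical.

# Generic randomized-gossip sketch (illustrative; not the thesis's algorithm).
# Nodes repeatedly pair up along edges and average their current estimates,
# so every estimate drifts toward the true mean; the speed of this mixing
# depends on the graph's bottleneck.
import random

# Hypothetical fixed topology: two 5-node cliques joined by a single edge.
n = 10
edges = [(i, j) for i in range(5) for j in range(i + 1, 5)]
edges += [(i, j) for i in range(5, 10) for j in range(i + 1, 10)]
edges += [(0, 5)]                               # the lone bridge between the cliques

random.seed(0)
values = [random.random() for _ in range(n)]    # initial measurements
target = sum(values) / n                        # linear function to be estimated
epsilon = 1e-3                                  # prescribed accuracy

x = values[:]
t = 0
while max(abs(v - target) for v in x) > epsilon:
    i, j = random.choice(edges)                 # one random edge becomes active
    x[i] = x[j] = (x[i] + x[j]) / 2             # its two endpoints average
    t += 1

print(f"all estimates within {epsilon} of the average after {t} pairwise exchanges")

Adding more edges between the two cliques raises the conductance of this toy graph and, consistent with the reciprocal dependence described above, reduces the number of exchanges required.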
 
Description
Includes bibliographical references (p. 101-103).
 
Thesis (Ph. D.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 2008.
 
Date issued
2008
URI
http://hdl.handle.net/1721.1/44405
Department
Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science
Publisher
Massachusetts Institute of Technology
Keywords
Electrical Engineering and Computer Science.

Collections
  • Doctoral Theses
