Local to global geometric methods in information theory
Author(s)Abbe, Emmanuel Auguste
Massachusetts Institute of Technology. Dept. of Electrical Engineering and Computer Science.
Lizhong Zheng and Emre Telatar.
MetadataShow full item record
This thesis treats several information theoretic problems with a unified geometric approach. The development of this approach was motivated by the challenges encountered while working on these problems, and in turn, the testing of the initial tools to these problems suggested numerous refinements and improvements on the geometric methods. In ergodic probabilistic settings, Sanov's theorem gives asymptotic estimates on the probabilities of very rare events. The theorem also characterizes the exponential decay of the probabilities, as the sample size grows, and the exponential rate is given by the minimization of a certain divergence expression. In his seminal paper, A Mathematical Theory of Communication, Shannon introduced two influential ideas to simplify the complex task of evaluating the performance of a coding scheme: the asymptotic perspective (in the number of channel uses) and the random coding argument. In this setting, Sanov's theorem can be used to analyze ergodic information theoretic problems, and the performance of a coding scheme can be estimated by expressions involving the divergence. One would then like to use a geometric intuition to solve these problems, but the divergence is not a distance and our naive geometric intuition may lead to incorrect conclusions. In information geometry, a specific differential geometric structure is introduced by means of "dual affine connections". The approach we take in this thesis is slightly different and is based on introducing additional asymptotic regimes to analyze the divergence expressions. The following two properties play an important role. The divergence may not be a distance, but locally (i.e., when its arguments are "close to each other"), the divergence behaves like a squared distance.(cont.) Moreover, globally (i.e., when its arguments have no local restriction), it also preserves certain properties satisfied by squared distances. Therefore, we develop the Very Noisy and Hermite transformations, as techniques to map our global information theoretic problems in local ones. Through this localization, our global divergence expressions reduce in the limit to expressions defined in an inner product space. This provides us with a valuable geometric insight to the global problems, as well as a strong tool to find counter-examples. Finally, in certain cases, we have been able to "lift" results proven locally to results proven globally.(cont.) Therefore, we develop the Very Noisy and Hermite transformations, as techniques to map our global information theoretic problems in local ones. Through this localization, our global divergence expressions reduce in the limit to expressions defined in an inner product space. This provides us with a valuable geometric insight to the global problems, as well as a strong tool to find counter-examples. Finally, in certain cases, we have been able to "lift" results proven locally to results proven globally. We consider the following three problems. First, we address the problem of finding good linear decoders (maximizing additive metrics) for compound discrete memoryless channels. Known universal decoders are not linear and most of them heavily depend on the finite alphabet assumption. We show that by using a finite number of additive metrics, we can construct decoders that are universal (capacity achieving) on most compound sets. We then consider additive Gaussian noise channels. For a given perturbation of a Gaussian input distribution, we define an operator that measures how much variation is induced in the output entropy. We found that the singular functions of this operator are the Hermite polynomials, and the singular values are the powers of a signal to noise ratio. We show, in particular, how to use this structure on a Gaussian interference channel to characterize a regime where interference should not be treated as noise. Finally, we consider multi-input multi-output channels and discuss the properties of the optimal input distributions, for various random fading matrix ensembles. In particular, we prove Telatar's conjecture on the covariance structure minimizing the outage probability for output dimension one and input dimensions less than one hundred.
Thesis (Ph. D.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 2008.Includes bibliographical references (p. 201-203).
DepartmentMassachusetts Institute of Technology. Department of Electrical Engineering and Computer Science
Massachusetts Institute of Technology
Electrical Engineering and Computer Science.