Fast spectral primitives for directed graphs

Peebles, John Lee Thompson, Jr.

Author(s)

Peebles, John Lee Thompson, Jr.

Other Contributors

Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science.

Advisor

Jonathan A. Kelner and Ronitt Rubinfeld.

Terms of use

MIT theses are protected by copyright. They may be viewed, downloaded, or printed from this source but further reproduction or distribution in any format is prohibited without written permission. http://dspace.mit.edu/handle/1721.1/7582

Abstract

In this thesis, we study several algorithmic problems involving numerical linear algebra, probability, and statistics. The main results include the following:

-- We give the first nearly linear time algorithms for a large class of directed graph problems, including computing the stationary distribution of a Markov chain with only a logarithmic dependence on the mixing time. Our approach is based on developing new spectral tools for directed graphs, including the first algorithms for sparsifying directed graphs and solving directed Laplacian linear systems.

-- Symmetric diagonally dominant matrices frequently arise in science and engineering applications, often when one is discretizing certain types of differential equations. We give faster algorithms for estimating the determinant of a symmetric diagonally dominant matrix and for sampling random spanning trees from a graph.

-- Generative adversarial networks (GANs) are an important technique used in deep learning. However, the methods for training them are not satisfactorily understood from a theoretical perspective. To help improve this understanding, we devise a parametric problem which is sophisticated enough to capture many of the main difficulties associated with GAN training, yet simple enough to analyze rigorously.

-- Fibonacci heaps are a data structure that implements a heap (a priority queue) with optimal amortized runtimes for its operations. They are especially useful when one has to change the priorities of elements many more times than one needs to remove elements from the data structure. We resolve two conjectures regarding the efficiency of variants of Fibonacci heaps, the first due to Karger and the second due to Fredman.

-- Perhaps the most basic statistical question one can ask is how many samples of data one needs in order to check whether the data came from a hypothesis distribution or not. This is sometimes called goodness of fit testing in statistics or identity testing in computer science. We give the first algorithms that provably use as few samples as possible for these problems in all parameter regimes up to constant factors.
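
To make the identity testing question concrete, the following is a minimal sketch of a chi-squared-style test statistic that compares observed samples against a hypothesis distribution. It is a generic illustration and not the sample-optimal algorithm from the thesis. The function name identity_statistic, the dictionary representation of the hypothesis, and the example data are all assumptions introduced here.

```python
import random
from collections import Counter


def identity_statistic(samples, q):
    """Chi-squared-style statistic for testing samples against hypothesis q.

    q maps each outcome to its hypothesized probability (assumed > 0).
    Samples outside the support of q are ignored in this sketch.
    """
    m = len(samples)
    counts = Counter(samples)
    total = 0.0
    for outcome, prob in q.items():
        expected = m * prob
        observed = counts.get(outcome, 0)
        # Subtracting `observed` cancels most of the sampling noise, so the
        # statistic stays near zero when the samples really come from q and
        # grows when the true distribution is far from q.
        total += ((observed - expected) ** 2 - observed) / expected
    return total


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so the run is repeatable
    q = {"a": 0.5, "b": 0.5}  # hypothetical hypothesis: a fair coin
    fair = [rng.choice("ab") for _ in range(1000)]
    biased = ["a"] * 1000
    print(identity_statistic(fair, q), identity_statistic(biased, q))
```

A tester would accept the hypothesis when the statistic falls below a threshold and reject it otherwise. Choosing that threshold and the number of samples depends on the distance parameter and on the analysis, and is deliberately left out of this sketch.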

Description

Thesis: Ph. D., Massachusetts Institute of Technology, Department of Electrical Engineering and Computer Science, 2019

Cataloged from PDF version of thesis.

Includes bibliographical references (pages 331-345).

Date issued

2019

URI

https://hdl.handle.net/1721.1/124075

Department

Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science

Publisher

Massachusetts Institute of Technology

Keywords

Electrical Engineering and Computer Science.

Collections

Doctoral Theses