Fast spectral primitives for directed graphs
Author(s)
Peebles, John Lee Thompson, Jr.
Download
1142190523-MIT.pdf (24.76Mb)
Other Contributors
Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science.
Advisor
Jonathan A. Kelner and Ronitt Rubinfeld.
Abstract
In this thesis, we study several algorithmic problems involving numerical linear algebra, probability, and statistics. Its main results include the following:
-- We give the first nearly linear time algorithms for a large class of directed graph problems, including computing the stationary distribution of a Markov chain with only a logarithmic dependence on the mixing time. Our approach is based on developing new spectral tools for directed graphs, including the first algorithms for sparsifying directed graphs and solving directed Laplacian linear systems.
-- Symmetric diagonally dominant matrices frequently arise in science and engineering applications, often when one discretizes certain types of differential equations. We give faster algorithms for estimating the determinant of a symmetric diagonally dominant matrix and for sampling random spanning trees from a graph.
-- Generative adversarial networks (GANs) are an important technique used in deep learning. However, the methods for training them are not satisfactorily understood from a theoretical perspective. To help improve this understanding, we devise a parametric problem which is sophisticated enough to capture many of the main difficulties associated with GAN training, yet simple enough to analyze rigorously.
-- Fibonacci heaps are a data structure that implements a heap (a priority queue) with optimal amortized runtimes for its operations. They are especially useful when one has to change the priorities of elements many more times than one needs to remove elements from the data structure. We resolve two conjectures regarding the efficiency of variants of Fibonacci heaps, the first due to Karger and the second due to Fredman.
-- Perhaps the most basic statistical question one can ask is how many samples of data one needs in order to check whether the data came from a hypothesis distribution or not. This is sometimes called goodness-of-fit testing in statistics or identity testing in computer science. We give the first algorithms that provably use as few samples as possible for these problems in all parameter regimes up to constant factors.
Description
Thesis: Ph. D., Massachusetts Institute of Technology, Department of Electrical Engineering and Computer Science, 2019. Cataloged from PDF version of thesis. Includes bibliographical references (pages 331-345).
Date issued
2019
Department
Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science
Publisher
Massachusetts Institute of Technology
Keywords
Electrical Engineering and Computer Science.