Bayesian time series models and scalable inference

Johnson, Matthew James, Ph. D. Massachusetts Institute of Technology

Other Contributors

Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science.

Advisor

Alan S. Willsky.

Terms of use

M.I.T. theses are protected by copyright. They may be viewed from this source for any purpose, but reproduction or distribution in any format is prohibited without written permission. See provided URL for inquiries about permission. http://dspace.mit.edu/handle/1721.1/7582

Abstract

With large and growing datasets and complex models, there is an increasing need for scalable Bayesian inference. We describe two lines of work to address this need. In the first part, we develop new algorithms for inference in hierarchical Bayesian time series models based on the hidden Markov model (HMM), hidden semi-Markov model (HSMM), and their Bayesian nonparametric extensions. The HMM is ubiquitous in Bayesian time series modeling, and both it and its Bayesian nonparametric extension, the hierarchical Dirichlet process hidden Markov model (HDP-HMM), have been applied in many settings. HSMMs and HDP-HSMMs extend these dynamical models to provide state-specific duration modeling, but at the cost of increased computational complexity for inference, which limits their general applicability. A challenge with all such models is scaling inference to large datasets. We address these challenges in several ways. First, we develop classes of duration models for which HSMM message passing complexity scales only linearly in the observation sequence length. Second, we apply the stochastic variational inference (SVI) framework to develop scalable inference for the HMM, HSMM, and their nonparametric extensions. Third, we build on these ideas to define a new Bayesian nonparametric model that can capture dynamics at multiple timescales while still allowing efficient and scalable inference.

In the second part of this thesis, we develop a theoretical framework to analyze a special case of a highly parallelizable sampling strategy we refer to as Hogwild Gibbs sampling. Thorough empirical work has shown that Hogwild Gibbs sampling works very well for inference in large latent Dirichlet allocation (LDA) models, but there is little theory to understand when it may be effective in general. By studying Hogwild Gibbs applied to sampling from Gaussian distributions, we develop analytical results as well as a deeper understanding of its behavior, including its convergence and correctness in some regimes.
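To make the Gaussian setting in the abstract concrete, the following is a minimal illustrative sketch and not the thesis's own code. It assumes the target is a Gaussian in information form with precision matrix J and potential vector h (so the mean solves J mu = h), and it simulates the parallel processors sequentially: each block of variables runs one local Gibbs sweep against a stale snapshot of the other blocks, after which all blocks synchronize. The function name, the strided partition, and the one-sweep-per-round schedule are all assumptions made for illustration.

```python
import random


def hogwild_gibbs_gaussian(J, h, num_blocks, num_rounds, seed=0):
    """Simulated Hogwild-style Gibbs sampling for a Gaussian (illustrative sketch).

    J is a symmetric positive definite precision matrix (list of lists) and
    h is the potential vector, so the exact mean mu satisfies J mu = h.
    Variables are split into num_blocks groups. In every round each group
    performs one Gibbs sweep over its own variables while reading the values
    of all other groups from a snapshot taken at the start of the round.
    """
    rng = random.Random(seed)
    n = len(h)
    # Hypothetical partition: variable i belongs to block i % num_blocks.
    blocks = [list(range(b, n, num_blocks)) for b in range(num_blocks)]
    x = [0.0] * n
    samples = []
    for _ in range(num_rounds):
        snapshot = list(x)  # stale state shared by all simulated processors
        new_x = list(x)
        for block in blocks:
            local = list(snapshot)  # this processor's private working copy
            for i in block:
                # Conditional of x_i given the rest is Gaussian with
                # mean (h_i - sum_{j != i} J_ij x_j) / J_ii and variance 1 / J_ii.
                off_diag = sum(J[i][j] * local[j] for j in range(n) if j != i)
                cond_mean = (h[i] - off_diag) / J[i][i]
                cond_std = (1.0 / J[i][i]) ** 0.5
                local[i] = rng.gauss(cond_mean, cond_std)
            for i in block:
                new_x[i] = local[i]  # publish only this block's own variables
        x = new_x  # synchronization step
        samples.append(list(x))
    return samples


def empirical_mean(samples, burn_in=0):
    """Average each coordinate over the samples after discarding burn_in rounds."""
    kept = samples[burn_in:]
    n = len(kept[0])
    return [sum(s[i] for s in kept) / len(kept) for i in range(n)]
```

With a diagonally dominant J, the empirical mean of the simulated chain should settle near the solution of J mu = h, while the sample covariance is not guaranteed to match the inverse of J; which regimes give correct behavior is the question the thesis analyzes.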

Description

Thesis: Ph. D., Massachusetts Institute of Technology, Department of Electrical Engineering and Computer Science, 2014.

Cataloged from PDF version of thesis.

Includes bibliographical references (pages 197-206).

Date issued

2014

URI

http://hdl.handle.net/1721.1/89993

Department

Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science

Publisher

Massachusetts Institute of Technology

Keywords

Electrical Engineering and Computer Science.

Collections

Doctoral Theses