Model selection in compositional spaces
Author(s)
Grosse, Roger Baker
Other Contributors
Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science.
Advisor
William T. Freeman.
Abstract
We often build complex probabilistic models by composing simpler models: using one model to generate parameters or latent variables for another model. This allows us to express complex distributions over the observed data and to share statistical structure between different parts of a model. In this thesis, we present a space of matrix decomposition models defined by the composition of a small number of motifs of probabilistic modeling, including clustering, low-rank factorizations, and binary latent factor models. This compositional structure can be represented by a context-free grammar whose production rules correspond to these motifs. By exploiting the structure of this grammar, we can generically and efficiently infer latent components and estimate predictive likelihood for nearly 2500 model structures using a small toolbox of reusable algorithms. Using a greedy search over this grammar, we automatically choose the decomposition structure from raw data by evaluating only a small fraction of all models. The proposed method typically finds the correct structure for synthetic data and backs off gracefully to simpler models under heavy noise. It learns sensible structures for datasets as diverse as image patches, motion capture, 20 Questions, and U.S. Senate votes, all using exactly the same code.

We then consider several improvements to compositional structure search. We present compositional importance sampling (CIS), a novel procedure for marginal likelihood estimation which requires only posterior inference and marginal likelihood estimation algorithms corresponding to the production rules of the grammar. We analyze the performance of CIS in the case of identifying additional structure within a low-rank decomposition. This analysis yields insights into how one should design a space of models to be recursively searchable. We next consider the problem of marginal likelihood estimation for the production rules. We present a novel method for obtaining ground truth marginal likelihood values on synthetic data, which enables the rigorous quantitative comparison of marginal likelihood estimators. Using this method, we compare a wide variety of marginal likelihood estimators for the production rules of our grammar. Finally, we present a framework for analyzing the sequences of distributions used in annealed importance sampling, a state-of-the-art marginal likelihood estimator, and present a novel sequence of intermediate distributions based on averaging moments of the initial and target distributions.
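To make the grammar-and-greedy-search idea concrete, here is a minimal Python sketch of expanding a context-free grammar of matrix decomposition structures and greedily selecting among the expansions. The production rules, symbol names, and the placeholder scoring function are illustrative assumptions for this sketch, not the thesis's exact rule set or implementation; a real scorer would fit each candidate model and estimate its predictive likelihood, as the abstract describes.

```python
# Illustrative sketch of grammar-based structure enumeration and greedy search.
# The production rules below are a simplified stand-in for a grammar over matrix
# decompositions; they and score_structure are assumptions for demonstration only.

# Component symbols (assumed for this sketch): G = Gaussian matrix,
# M = multinomial (cluster assignments), B = binary latent features.
# Each production rewrites a G as a product-plus-noise expression,
# mirroring motifs such as clustering and low-rank factorization.
PRODUCTIONS = {
    "G": [
        "(GG + G)",   # low-rank factorization
        "(MG + G)",   # clustering of rows
        "(GM' + G)",  # clustering of columns
        "(BG + G)",   # binary latent features on rows
        "(GB' + G)",  # binary latent features on columns
    ]
}

def expand_once(structure):
    """Yield every structure obtained by applying one production to one G."""
    for i, sym in enumerate(structure):
        if sym == "G":
            for rhs in PRODUCTIONS["G"]:
                yield structure[:i] + rhs + structure[i + 1:]

def greedy_search(data, score_structure, max_depth=3):
    """Greedily expand the best structure found so far, keeping the child with
    the highest score (e.g. an estimate of predictive likelihood), and backing
    off to the simpler structure when no expansion improves the score."""
    best, best_score = "G", score_structure("G", data)
    for _ in range(max_depth):
        candidates = list(expand_once(best))
        if not candidates:
            break
        top_score, top = max((score_structure(s, data), s) for s in candidates)
        if top_score <= best_score:
            break  # no expansion helps; keep the simpler model
        best, best_score = top, top_score
    return best, best_score

if __name__ == "__main__":
    # Placeholder scorer that simply prefers shorter structures; a real scorer
    # would perform posterior inference for each candidate decomposition.
    dummy_score = lambda s, data: -len(s)
    print(greedy_search(data=None, score_structure=dummy_score))
```

Because each level of the search only expands the current best structure, the number of models actually evaluated grows linearly with depth rather than with the size of the full space, which is what allows a space of thousands of structures to be searched by scoring only a small fraction of them.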
Description
Thesis: Ph. D., Massachusetts Institute of Technology, Department of Electrical Engineering and Computer Science, 2014. This electronic version was submitted by the student author. The certified thesis is available in the Institute Archives and Special Collections. Cataloged from student-submitted PDF version of thesis. Includes bibliographical references (pages 172-181).
Date issued
2014
Department
Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science
Publisher
Massachusetts Institute of Technology
Keywords
Electrical Engineering and Computer Science.