Sampling in computer vision and Bayesian nonparametric mixtures

Chang, Jason, Ph. D. Massachusetts Institute of Technology

Other Contributors

Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science.

Advisor

John W. Fisher, III.

Terms of use

M.I.T. theses are protected by copyright. They may be viewed from this source for any purpose, but reproduction or distribution in any format is prohibited without written permission. See provided URL for inquiries about permission. http://dspace.mit.edu/handle/1721.1/7582

Abstract

The field of computer vision focuses on understanding and reasoning about the visual world. Due to the complexity of this problem, researchers often focus on one specific component of this large task, such as segmentation or recognition. This modularized approach necessitates the combination of each separate component, which Bayesian formulations handle in a mathematically consistent framework. Unfortunately, probabilistic formulations are often difficult in computer vision due to the complexity and large dimensionality of data. In this thesis, we demonstrate how efficient Markov chain Monte Carlo (MCMC) sampling techniques can address a subset of these problems. In the first half of this thesis, we consider the problem of inference in discrete Markov random fields (MRFs) that often occur in segmentation and tracking. We develop the Permutation-based Gibbs-Inspired Metropolis-Hastings (PGIMH) sampling algorithm and show its applicability to a variety of formulations (including curve-length penalties and topology priors). In particle filtering, PGIMH precludes the need to update particle weights or to use sequential importance resampling. Empirical results demonstrate that PGIMH is approximately 10^4 times faster than previous shape sampling approaches and that it improves results in segmentation, boundary detection, and object tracking. In the second half of this thesis, we focus on inference in the Dirichlet process mixture model (DPMM), which is often slow and cumbersome due to the infinite number of mixture components. We develop a parallel algorithm that samples from the posterior distribution of a DPMM without requiring finite model approximations. This method, called DP Sub-Clusters, essentially fits a two-component mixture model to each regular cluster. These "sub-clusters" are then used to propose splits and merges, resulting in the efficient exploration of the sample space.
We show how the developed framework extends to other mixture models, such as the hierarchical Dirichlet process, often used in document analysis. Additionally, we develop the spatially-varying Dirichlet process Gaussian mixture model (SV-DPGMM), which achieves state-of-the-art results in intrinsic image decomposition by leveraging the DP Sub-Cluster algorithm. By addressing these problems, we demonstrate the applicability of MCMC methods to computer vision, and highlight the importance of designing fast sampling algorithms.
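For readers unfamiliar with MCMC inference in a discrete MRF, the following is a minimal sketch of plain single-site Gibbs sampling for binary segmentation. It is not the PGIMH algorithm from the thesis, which the abstract describes only as Gibbs-inspired. The function names, the Gaussian data term with class means 0 and 1, and the parameters beta, sigma, sweeps, and burn_in are all assumptions made for illustration.

```python
import math
import random

# 4-connected neighbourhood offsets (an assumed, common choice of MRF graph).
NEIGHBOURS = ((-1, 0), (1, 0), (0, -1), (0, 1))


def gibbs_sweep(labels, image, beta, sigma, rng):
    """Resample every pixel label once, in place, from its conditional."""
    rows, cols = len(labels), len(labels[0])
    means = (0.0, 1.0)  # assumed intensity means for labels 0 and 1
    for r in range(rows):
        for c in range(cols):
            log_p = []
            for k in (0, 1):
                # Gaussian data term (up to an additive constant).
                score = -((image[r][c] - means[k]) ** 2) / (2.0 * sigma ** 2)
                # Ising/Potts-style smoothness term: reward each
                # neighbour that currently carries label k.
                for dr, dc in NEIGHBOURS:
                    rr, cc = r + dr, c + dc
                    if 0 <= rr < rows and 0 <= cc < cols and labels[rr][cc] == k:
                        score += beta
                log_p.append(score)
            # Conditional probability of label 1 given the neighbours.
            p_one = 1.0 / (1.0 + math.exp(log_p[0] - log_p[1]))
            labels[r][c] = 1 if rng.random() < p_one else 0
    return labels


def segment(image, beta=1.0, sigma=0.3, sweeps=50, burn_in=20, seed=0):
    """Estimate per-pixel posterior marginals P(label = 1) by averaging samples."""
    rng = random.Random(seed)
    rows, cols = len(image), len(image[0])
    labels = [[rng.randint(0, 1) for _ in range(cols)] for _ in range(rows)]
    counts = [[0] * cols for _ in range(rows)]
    for sweep in range(sweeps):
        gibbs_sweep(labels, image, beta, sigma, rng)
        if sweep >= burn_in:  # discard early samples taken before mixing
            for r in range(rows):
                for c in range(cols):
                    counts[r][c] += labels[r][c]
    kept = sweeps - burn_in
    return [[counts[r][c] / kept for c in range(cols)] for r in range(rows)]


if __name__ == "__main__":
    # Toy image: dark left half, bright right half.
    toy = [[0.0] * 4 + [1.0] * 4 for _ in range(8)]
    for row in segment(toy):
        print(" ".join(f"{v:.2f}" for v in row))
```

Averaging samples after burn-in yields marginal probabilities rather than a single labeling, which is one reason sampling is attractive in a Bayesian pipeline. A sampler of this kind changes one pixel at a time and can mix slowly on large images, which is the sort of cost that motivates faster sampling algorithms.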

Description

Thesis: Ph. D., Massachusetts Institute of Technology, Department of Electrical Engineering and Computer Science, 2014.

This electronic version was submitted by the student author. The certified thesis is available in the Institute Archives and Special Collections.

Cataloged from student-submitted PDF version of thesis.

Includes bibliographical references (pages 213-221).

Date issued

2014

URI

http://hdl.handle.net/1721.1/91042

Department

Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science

Publisher

Massachusetts Institute of Technology

Keywords

Electrical Engineering and Computer Science.

Collections

Doctoral Theses