Extending expectation propagation for graphical models

Author(s)

Qi, Yuan, 1974-

Other Contributors

Massachusetts Institute of Technology. Dept. of Architecture. Program In Media Arts and Sciences

Advisor

Rosalind W. Picard.

Terms of use

M.I.T. theses are protected by copyright. They may be viewed from this source for any purpose, but reproduction or distribution in any format is prohibited without written permission. See provided URL for inquiries about permission. http://dspace.mit.edu/handle/1721.1/30215 http://dspace.mit.edu/handle/1721.1/7582

Abstract

Graphical models have been widely used in many applications, ranging from human behavior recognition to wireless signal detection. However, efficient inference and learning techniques for graphical models are needed to handle complex models, such as hybrid Bayesian networks. This thesis proposes extensions of expectation propagation (EP), a powerful generalization of loopy belief propagation, to develop efficient Bayesian inference and learning algorithms for graphical models. The first two chapters of the thesis present inference algorithms for generative graphical models, and the next two propose learning algorithms for conditional graphical models. First, the thesis proposes a window-based EP smoothing algorithm for online estimation on hybrid dynamic Bayesian networks. For an application in wireless communications, window-based EP smoothing achieves estimation accuracy comparable to sequential Monte Carlo methods, but at less than one-tenth of the computational cost. Second, it develops a new method that combines tree-structured EP approximations with the junction tree for inference on loopy graphs. This new method saves computation and memory by propagating messages only locally to a subgraph when processing each edge in the entire graph. Using this local propagation scheme, this method is not only more accurate, but also faster than loopy belief propagation and structured variational methods. Third, it proposes predictive automatic relevance determination (ARD) to enhance classification accuracy in the presence of irrelevant features. ARD is a Bayesian technique for feature selection.

The thesis discusses the overfitting problem associated with ARD, and proposes a method that optimizes the estimated predictive performance, instead of maximizing the model evidence. For a gene expression classification problem, predictive ARD outperforms previous methods, including traditional ARD as well as support vector machines combined with feature selection techniques. Finally, it presents Bayesian conditional random fields (BCRFs) for classifying interdependent and structured data, such as sequences, images or webs. BCRFs estimate the posterior distribution of model parameters and average prediction over this posterior to avoid overfitting. For the problems of frequently-asked-question labeling and of ink recognition, BCRFs achieve superior prediction accuracy over conditional random fields trained with maximum likelihood and maximum a posteriori criteria.
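
For readers unfamiliar with expectation propagation, the following is a minimal illustrative sketch of the moment-matching step on which EP-style algorithms are built. It is not taken from the thesis and is not any of the algorithms the abstract describes. It assumes a one-dimensional Gaussian belief and a single probit factor, and the function names (std_normal_pdf, std_normal_cdf, match_moments_probit) are hypothetical. The step replaces the non-Gaussian product of belief and factor with the Gaussian that has the same mean and variance, using the standard closed-form result for a probit factor.

```python
import math


def std_normal_pdf(z):
    """Density of the standard normal distribution at z."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def std_normal_cdf(z):
    """Cumulative distribution function of the standard normal at z."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def match_moments_probit(mean, var, y):
    """Mean and variance of the tilted distribution
    p(x) proportional to N(x; mean, var) * Phi(y * x), with y in {-1, +1}.

    Illustrative assumption: a single probit factor in one dimension.
    A full EP algorithm would repeat this step over many factors and
    also maintain per-factor approximations, which is omitted here.
    """
    scale = math.sqrt(1.0 + var)
    z = y * mean / scale
    ratio = std_normal_pdf(z) / std_normal_cdf(z)
    new_mean = mean + y * var * ratio / scale
    new_var = var - (var * var) * ratio * (z + ratio) / (1.0 + var)
    return new_mean, new_var


# Standard normal belief, one observation with label +1.
new_mean, new_var = match_moments_probit(0.0, 1.0, +1)
print(new_mean, new_var)
```

For a standard normal belief and label +1, the matched moments can be checked analytically: the mean should be approximately 1/sqrt(pi) and the variance approximately 1 - 1/pi.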

Description

Thesis (Ph. D.)--Massachusetts Institute of Technology, School of Architecture and Planning, Program in Media Arts and Sciences, 2005.

Includes bibliographical references (p. 101-106).

Date issued

2005

URI

http://dspace.mit.edu/handle/1721.1/30215
http://hdl.handle.net/1721.1/30215

Department

Program in Media Arts and Sciences (Massachusetts Institute of Technology)

Publisher

Massachusetts Institute of Technology

Keywords

Architecture. Program In Media Arts and Sciences

Collections

Doctoral Theses