Learning high-dimensional Markov forest distributions: Analysis of error rates

Tan, Vincent Y.F.; Anandkumar, Animashree; Willsky, Alan S.

Author(s)

Tan, Vincent Yan Fu; Anandkumar, Animashree; Willsky, Alan S.

Terms of use

Article is made available in accordance with the publisher's policy and may be subject to US copyright law. Please refer to the publisher's site for terms of use.

Abstract

The problem of learning forest-structured discrete graphical models from i.i.d. samples is considered. An algorithm based on pruning of the Chow-Liu tree through adaptive thresholding is proposed. It is shown that this algorithm is both structurally consistent and risk consistent, and that the error probability of structure learning decays faster than any polynomial in the number of samples when the model size is fixed. For the high-dimensional scenario where the size of the model d and the number of edges k scale with the number of samples n, sufficient conditions on (n,d,k) are given for the algorithm to satisfy structural and risk consistencies. In addition, the extremal structures for learning are identified; we prove that the independent (resp., tree) model is the hardest (resp., easiest) to learn using the proposed algorithm in terms of error rates for structure learning.
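As a rough illustration of the idea in the abstract (build a Chow-Liu tree, then prune weak edges with a sample-size-dependent threshold), the following minimal sketch may help. It is not the authors' implementation: the function names, the plug-in mutual information estimator, and the threshold schedule n^(-beta) with beta = 0.5 are all assumptions chosen for illustration, and the toy data set is constructed by hand.

```python
import math
from collections import Counter
from itertools import combinations, product


def empirical_mutual_information(samples, i, j):
    """Plug-in estimate of I(X_i; X_j) in nats from discrete sample tuples."""
    n = len(samples)
    joint = Counter((s[i], s[j]) for s in samples)
    marg_i = Counter(s[i] for s in samples)
    marg_j = Counter(s[j] for s in samples)
    mi = 0.0
    for (a, b), c in joint.items():
        mi += (c / n) * math.log(c * n / (marg_i[a] * marg_j[b]))
    return max(mi, 0.0)  # guard against tiny negative rounding errors


def chow_liu_edges(samples, d):
    """Maximum-weight spanning tree over empirical MI (Kruskal + union-find)."""
    weights = {
        (i, j): empirical_mutual_information(samples, i, j)
        for i, j in combinations(range(d), 2)
    }
    parent = list(range(d))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    tree = []
    for (i, j), w in sorted(weights.items(), key=lambda kv: -kv[1]):
        ri, rj = find(i), find(j)
        if ri != rj:  # adding this edge does not create a cycle
            parent[ri] = rj
            tree.append((i, j, w))
    return tree


def prune_to_forest(samples, d, beta=0.5):
    """Keep only tree edges whose empirical MI reaches the threshold.

    The schedule n ** (-beta) is an assumed example of an adaptive
    threshold that shrinks as the number of samples n grows.
    """
    threshold = len(samples) ** (-beta)
    return [(i, j) for i, j, w in chow_liu_edges(samples, d) if w >= threshold]


# Hand-built toy data over 4 binary variables: X1 copies X0, while X2 and X3
# are exactly independent of everything else in the empirical distribution.
samples = [
    (x0, x0, x2, x3)
    for x0, x2, x3 in product((0, 1), repeat=3)
    for _ in range(50)
]
forest = prune_to_forest(samples, d=4)
print(forest)  # [(0, 1)]
```

In this toy case the Chow-Liu step returns a spanning tree with three edges, two of which carry zero empirical mutual information; the thresholding step removes them and leaves the single edge between variables 0 and 1.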

Date issued

2011-05

URI

http://hdl.handle.net/1721.1/66514

Department

Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science; Massachusetts Institute of Technology. Laboratory for Information and Decision Systems

Journal

Journal of Machine Learning Research

Publisher

MIT Press

Citation

Tan, Vincent Y.F., Animashree Anandkumar and Alan S. Willsky. "Learning High-Dimensional Markov Forest Distributions: Analysis of Error Rates." Journal of Machine Learning Research, 12 (2011) 1617-1653.

Version: Final published version

ISSN

1532-4435

1533-7928

Collections

MIT Open Access Articles

DSpace@MIT