Large pseudocounts and L₂-norm penalties are necessary for the mean-field inference of Ising and Potts models

Barton, J. P.; Cocco, S.; De Leonardis, E.; Monasson, R.

dc.contributor.author	Cocco, S.
dc.contributor.author	De Leonardis, E.
dc.contributor.author	Monasson, R.
dc.contributor.author	Barton, John P.
dc.date.accessioned	2015-06-17T14:53:26Z
dc.date.available	2015-06-17T14:53:26Z
dc.date.issued	2014-07
dc.date.submitted	2014-06
dc.identifier.issn	1539-3755
dc.identifier.issn	1550-2376
dc.identifier.uri	http://hdl.handle.net/1721.1/97450
dc.description.abstract	The mean-field (MF) approximation offers a simple, fast way to infer direct interactions between elements in a network of correlated variables, a common, computationally challenging problem with practical applications in fields ranging from physics and biology to the social sciences. However, MF methods achieve their best performance with strong regularization, well beyond Bayesian expectations, an empirical fact that is poorly understood. In this work, we study the influence of pseudocount and L₂-norm regularization schemes on the quality of inferred Ising or Potts interaction networks from correlation data within the MF approximation. We argue, based on the analysis of small systems, that the optimal value of the regularization strength remains finite even if the sampling noise tends to zero, in order to correct for systematic biases introduced by the MF approximation. Our claim is corroborated by extensive numerical studies of diverse model systems and by the analytical study of the m-component spin model for large but finite m. Additionally, we find that pseudocount regularization is robust against sampling noise and often outperforms L₂-norm regularization, particularly when the underlying network of interactions is strongly heterogeneous. Much better performances are generally obtained for the Ising model than for the Potts model, for which only couplings incoming onto medium-frequency symbols are reliably inferred.	en_US
dc.description.sponsorship	France. Agence nationale de la recherche (Coevstat Project Grant ANR-13-BS04-0012-01)	en_US
dc.language.iso	en_US
dc.publisher	American Physical Society	en_US
dc.relation.isversionof	http://dx.doi.org/10.1103/PhysRevE.90.012132	en_US
dc.rights	Article is made available in accordance with the publisher's policy and may be subject to US copyright law. Please refer to the publisher's site for terms of use.	en_US
dc.source	American Physical Society	en_US
dc.title	Large pseudocounts and L₂-norm penalties are necessary for the mean-field inference of Ising and Potts models	en_US
dc.type	Article	en_US
dc.identifier.citation	Barton, J. P., S. Cocco, E. De Leonardis, and R. Monasson. “Large Pseudocounts and L₂-Norm Penalties Are Necessary for the Mean-Field Inference of Ising and Potts Models.” Phys. Rev. E 90, no. 1 (July 2014). © 2014 American Physical Society	en_US
dc.contributor.department	Massachusetts Institute of Technology. Department of Chemical Engineering	en_US
dc.contributor.department	Ragon Institute of MGH, MIT and Harvard	en_US
dc.contributor.mitauthor	Barton, John P.	en_US
dc.relation.journal	Physical Review E	en_US
dc.eprint.version	Final published version	en_US
dc.type.uri	http://purl.org/eprint/type/JournalArticle	en_US
eprint.status	http://purl.org/eprint/status/PeerReviewed	en_US
dspace.orderedauthors	Barton, J. P.; Cocco, S.; De Leonardis, E.; Monasson, R.	en_US
dc.identifier.orcid	https://orcid.org/0000-0003-1467-421X
mit.license	PUBLISHER_POLICY	en_US
mit.metadata.status	Complete

Files in this item

Name: Barton-2014-Large pseudocounts.pdf
Size: 2.369Mb
Format: PDF


This item appears in the following Collection(s)

MIT Open Access Articles
