On the Entropy of Protein Families
Author(s)
Cocco, Simona; Jacquin, Hugo; Monasson, Rémi; Chakraborty, Arup K; Barton, John P.
Open Access Policy
Creative Commons Attribution-Noncommercial-Share Alike
Terms of use
Abstract
Proteins are essential components of living systems, capable of performing a huge variety of tasks at the molecular level, such as recognition, signalling, copying, and transport. The protein sequences realizing a given function may vary widely across organisms, giving rise to a protein family. Here, we estimate the entropy of those families using several approaches, including the Hidden Markov Models used for protein databases and inferred statistical models that reproduce the low-order (1- and 2-point) statistics of multiple sequence alignments. We also compute the entropic cost, that is, the loss in entropy resulting from a constraint acting on the protein, such as the mutation of one particular amino acid at a specific site, and relate this notion to the escape probability of HIV. The case of lattice proteins, for which the entropy can be computed exactly, provides another illustration of the concept of cost, here arising from the competition between different folds. The relevance of the entropy to directed evolution experiments is stressed.
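As a rough illustration of the quantities named in the abstract, the sketch below computes the entropy of a toy alignment under an independent-site model, together with the entropic cost of fixing one site. This is a deliberate simplification and not the method of the paper, which relies on Hidden Markov Models and on models that also reproduce 2-point statistics. The alignment `toy` and the function names are hypothetical, chosen only for this example.

```python
import math
from collections import Counter


def site_entropies(alignment):
    """Shannon entropy (in nats) of each column of an aligned set of sequences.

    Assumption: sites are treated as independent, so only the 1-point
    (single-site) frequencies of the alignment are used.
    """
    n_seq = len(alignment)
    length = len(alignment[0])
    entropies = []
    for i in range(length):
        counts = Counter(seq[i] for seq in alignment)
        h = -sum((c / n_seq) * math.log(c / n_seq) for c in counts.values())
        entropies.append(h)
    return entropies


def independent_entropy(alignment):
    """Total entropy under the independent-site model: the sum over sites."""
    return sum(site_entropies(alignment))


def entropic_cost(alignment, site):
    """Entropy lost when `site` is constrained to a single amino acid.

    Under the independent-site assumption this is simply the entropy of
    that site; with couplings between sites the cost would differ.
    """
    return site_entropies(alignment)[site]


# Hypothetical toy alignment: two conserved sites, two variable sites.
toy = ["ACDA", "ACEA", "ACDG", "ACEG"]
total = independent_entropy(toy)
cost = entropic_cost(toy, 2)
```

Because the independent-site model is the maximum-entropy distribution compatible with the single-site frequencies, its entropy is an upper bound on that of any model sharing those frequencies; accounting for 2-point statistics can only lower the estimate.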
Date issued
2016-01
Department
Massachusetts Institute of Technology. Institute for Medical Engineering & Science; Massachusetts Institute of Technology. Department of Biological Engineering; Massachusetts Institute of Technology. Department of Chemical Engineering; Massachusetts Institute of Technology. Department of Chemistry; Massachusetts Institute of Technology. Department of Physics; Ragon Institute of MGH, MIT and Harvard
Journal
Journal of Statistical Physics
Publisher
Springer-Verlag
Citation
Barton, John P., Arup K. Chakraborty, Simona Cocco, Hugo Jacquin, and Rémi Monasson. “On the Entropy of Protein Families.” Journal of Statistical Physics, vol. 162, no. 5, January 2016, pp. 1267–1293.
Version: Author's final manuscript
ISSN
0022-4715
1572-9613