dc.contributor.advisor: Bruce Tidor. [en_US]
dc.contributor.author: King, Bracken Matheny [en_US]
dc.contributor.other: Massachusetts Institute of Technology. Dept. of Biological Engineering. [en_US]
dc.date.accessioned: 2011-02-23T14:19:40Z
dc.date.available: 2011-02-23T14:19:40Z
dc.date.copyright: 2010 [en_US]
dc.date.issued: 2010 [en_US]
dc.identifier.uri: http://hdl.handle.net/1721.1/61143
dc.description: Thesis (Ph. D.)--Massachusetts Institute of Technology, Dept. of Biological Engineering, 2010. [en_US]
dc.description: Cataloged from PDF version of thesis. [en_US]
dc.description: Includes bibliographical references (p. 115-123). [en_US]
dc.description.abstract: The identification and quantification of high-dimensional relationships is a major challenge in the analysis of both biological and chemical systems. To address this challenge, a variety of experimental and computational tools have been developed to generate multivariate samples from these systems. Information theory provides a general framework for the analysis of such data, but for many applications, the large sample sizes needed to reliably compute high-dimensional information theoretic statistics are not available. In this thesis we develop, validate, and apply a novel framework for approximating high-dimensional information theoretic statistics using associated terms of arbitrarily low order. For a variety of synthetic, biological, and chemical systems, we find that these low-order approximations provide good estimates of higher-order multivariate relationships, while dramatically reducing the number of samples needed to reach convergence. We apply the framework to the analysis of multiple biological systems, including a phospho-proteomic data set in which we identify a subset of phospho-peptides that is maximally informative of cellular response (migration and proliferation) across multiple conditions (varying EGF or heregulin stimulation, and HER2 expression). This subset is shown to produce statistical models with superior performance to those built with subsets of similar size. We also employ the framework to extract configurational entropies from molecular dynamics simulations of a series of small molecules, demonstrating improved convergence relative to existing methods. As these disparate applications highlight, our framework enables the use of general information theoretic phrasings even in systems where data quantities preclude direct estimation of the high-order statistics. Furthermore, because the framework provides a hierarchy of approximations of increasing order, as data collection and analysis techniques improve, the method extends to generate more accurate results, while maintaining the same underlying theory. [en_US]
dc.description.statementofresponsibility: by Bracken Matheny King. [en_US]
dc.format.extent: 123 p. [en_US]
dc.language.iso: eng [en_US]
dc.publisher: Massachusetts Institute of Technology [en_US]
dc.rights: MIT theses are protected by copyright. They may be viewed, downloaded, or printed from this source but further reproduction or distribution in any format is prohibited without written permission. [en_US]
dc.rights.uri: http://dspace.mit.edu/handle/1721.1/7582 [en_US]
dc.subject: Biological Engineering. [en_US]
dc.title: Analysis of biological and chemical systems using information theoretic approximations [en_US]
dc.type: Thesis [en_US]
dc.description.degree: Ph.D. [en_US]
dc.contributor.department: Massachusetts Institute of Technology. Department of Biological Engineering
dc.identifier.oclc: 698095409 [en_US]

