
dc.contributor.advisor: David K. Gifford. [en_US]
dc.contributor.author: Gerber, Georg Kurt, 1970- [en_US]
dc.contributor.other: Harvard University--MIT Division of Health Sciences and Technology. [en_US]
dc.date.accessioned: 2008-09-03T14:52:40Z
dc.date.available: 2008-09-03T14:52:40Z
dc.date.copyright: 2007 [en_US]
dc.date.issued: 2007 [en_US]
dc.identifier.uri: http://hdl.handle.net/1721.1/42201
dc.description: Thesis (Ph. D.)--Harvard-MIT Division of Health Sciences and Technology, 2007. [en_US]
dc.description: Includes bibliographical references (p. 163-181). [en_US]
dc.description.abstract: High-throughput molecular data are revolutionizing biology by providing massive amounts of information about gene expression and regulation. Such information is applicable both to furthering our understanding of fundamental biology and to developing new diagnostic and treatment approaches for diseases. However, novel mathematical methods are needed for extracting biological knowledge from high-dimensional, complex and noisy data sources. In this thesis, I develop and apply three novel computational approaches for this task. The common theme of these approaches is that they seek to discover meaningful groups of genes, which confer robustness to noise and compress complex information into interpretable models. I first present the GRAM algorithm, which fuses information from genome-wide expression and in vivo transcription factor-DNA binding data to discover regulatory networks of gene modules. I use the GRAM algorithm to discover regulatory networks in Saccharomyces cerevisiae, including rich media, rapamycin, and cell-cycle module networks. I use functional annotation databases, independent biological experiments and DNA-motif information to validate the discovered networks, and to show that they yield new biological insights. Second, I present GeneProgram, a framework based on Hierarchical Dirichlet Processes, which uses large compendia of mammalian expression data to simultaneously organize genes into overlapping programs and tissues into groups to produce maps of expression programs. I demonstrate that GeneProgram outperforms several popular analysis methods, and using mouse and human expression data, show that it automatically constructs a comprehensive, body-wide map of inter-species expression programs. [en_US]
dc.description.abstract: (cont.) Finally, I present an extension of GeneProgram that models temporal dynamics. I apply the algorithm to a compendium of short time-series gene expression experiments in which human cells were exposed to various infectious agents. I show that discovered expression programs exhibit temporal pattern usage differences corresponding to classes of host cells and infectious agents, and describe several programs that implicate surprising signaling pathways and receptor types in human responses to infection. [en_US]
dc.description.statementofresponsibility: by Georg Kurt Gerber. [en_US]
dc.format.extent: 181 p. [en_US]
dc.language.iso: eng [en_US]
dc.publisher: Massachusetts Institute of Technology [en_US]
dc.rights: M.I.T. theses are protected by copyright. They may be viewed from this source for any purpose, but reproduction or distribution in any format is prohibited without written permission. See provided URL for inquiries about permission. [en_US]
dc.rights.uri: http://dspace.mit.edu/handle/1721.1/7582 [en_US]
dc.subject: Harvard University--MIT Division of Health Sciences and Technology. [en_US]
dc.title: Computational discovery of gene modules, regulatory networks and expression programs [en_US]
dc.type: Thesis [en_US]
dc.description.degree: Ph.D. [en_US]
dc.contributor.department: Harvard University--MIT Division of Health Sciences and Technology
dc.identifier.oclc: 230820346 [en_US]

