Motifs, binding, and expression : computational studies of transcriptional regulation
Author(s)
MacIsaac, Kenzie D. (Kenzie Daniel), 1975-
Other Contributors
Massachusetts Institute of Technology. Dept. of Electrical Engineering and Computer Science.
Advisor
Ernest Fraenkel.
Abstract
Organisms must control gene expression in response to developmental, nutritional, or other environmental cues. This process is known as transcriptional regulation and occurs through complex networks of proteins interacting with specific regulatory sites in the genome. Recently, high throughput variations of experimental techniques like transcriptional profiling and chromatin immunoprecipitation have emerged and taken on increasing importance in the study of regulatory processes. Mining these experiments for useful biological information requires methods that can handle large quantities of noisy data and integrate information from disparate experimental sources in a principled manner. Not coincidentally, computational and statistical methods for analyzing these data have increasingly become a focal point of research efforts. In this thesis we address three key challenges in the analysis of genomic sequence, protein localization, and expression data: (1) learning representations of the specific binding interactions that determine connectivity in regulatory networks, (2) developing physically grounded models describing these interactions, and (3) relating binding to its ultimate effect on the expression of regulated genes. To this end, we present several different algorithms and modeling techniques and apply them to real biological data in yeast, mouse, and human. Our results demonstrate the utility of leveraging multiple sources of information for improving motif analyses of chromatin immunoprecipitation data. Phylogenetic conservation information and knowledge of an immunoprecipitated protein's DNA binding domain are both shown to have great value in this context.
We next present a biophysically motivated framework for modeling protein-DNA interactions and show how it leads to very natural algorithms for analyzing the binding specificity of an immunoprecipitated protein, and jointly analyzing protein localization data for multiple regulators or multiple conditions. Finally, we present an analysis of transcriptional coregulator binding in a variety of mouse tissues and a method for predicting which proteins form complexes with the coregulator based purely on the sequence of the regions it binds. We detail a simple but powerful model relating regulator binding to gene expression, and show how the position of regulatory regions is of crucial importance for predicting the expression level of nearby genes.
Description
Thesis (Ph. D.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 2009. This electronic version was submitted by the student author. The certified thesis is available in the Institute Archives and Special Collections. Cataloged from student-submitted PDF version of thesis. Includes bibliographical references (p. 141-150).
Date issued
2009
Department
Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science
Publisher
Massachusetts Institute of Technology
Keywords
Electrical Engineering and Computer Science.