Inferring interactions, expression programs and regulatory networks from high throughput biological data
Author(s)
Bar-Joseph, Ziv, 1971-
Other Contributors
Massachusetts Institute of Technology. Dept. of Electrical Engineering and Computer Science.
Advisor
David K. Gifford and Tommi S. Jaakkola.
Abstract
In this thesis I present algorithms for analyzing high throughput biological datasets. These algorithms work on a number of different analysis levels to infer interactions between genes, determine gene expression programs and model complex biological networks. Recent advances in high-throughput experimental methods in molecular biology hold great promise. DNA microarray technologies enable researchers to measure the expression levels of thousands of genes simultaneously. Time series expression data offers particularly rich opportunities for understanding the dynamics of biological processes. In addition to measuring expression data, microarrays have been recently exploited to measure genome-wide protein-DNA binding events. While these types of data are revolutionizing biology, they also present many computational challenges. Principled computational methods are required in order to make full use of each of these datasets, and to combine them to infer interactions and discover networks for modeling different systems in the cell.
The algorithms presented in this thesis address three different analysis levels of high throughput biological data: recovering individual gene values, pattern recognition and networks. For time series expression data, I present algorithms that permit the principled estimation of unobserved time-points, alignment and the identification of differentially expressed genes. For pattern recognition, I present algorithms for clustering continuous data, and for ordering the leaves of a clustering tree to infer expression programs.
For the networks level I present an algorithm that efficiently combines complementary large-scale expression and protein-DNA binding data to discover co-regulated modules of genes. This algorithm is extended so that it can infer sub-networks for specific systems in the cell. Finally, I present an algorithm which combines some of the above methods to automatically infer a dynamic sub-network for the cell cycle system.
Description
Thesis (Ph. D.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 2003. Includes bibliographical references (leaves 171-180).
Date issued
2003
Department
Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science
Publisher
Massachusetts Institute of Technology
Keywords
Electrical Engineering and Computer Science.