Extracting transcriptional regulatory information from DNA microarray expression data
Author(s): Schmitt, William A. (William Anthony), 1976-
Massachusetts Institute of Technology. Dept. of Chemical Engineering.
Recent technological developments allow all the genes of a species to be monitored simultaneously at the transcriptional level. This necessitates a more global approach to biology that includes consideration of complex interactions between many genes and other intracellular species. The metaphor of a cell as a miniature chemical plant with inputs, outputs, and controls gives chemical engineers a foothold in this type of analysis. Networks of interacting genes are fertile ground for the application of the methods developed by engineers for the analysis and monitoring of industrial chemical processes. The DNA microarray has been established as a tool for efficient collection of mRNA expression data for a large number of genes simultaneously. Although great strides have been made in the methodology and instrumentation of this technique, the development of the computational tools needed to interpret the results has received relatively little attention. Existing analyses, such as clustering techniques applied to static data from cells at many different states, provide insight into co-expression of genes and are an important basis for exploration of the cell's genetic programming. We propose that an even greater level of regulatory detail may be gained by dynamically changing experimental conditions (the input signal) and measuring the time-delayed response of the genes (the output signal). The addition of temporal information to DNA microarray experiments should suggest potential cause/effect relationships among genes with significant regulatory responses to the conditions of interest. This thesis aims to develop computational techniques to maximize the information gained from such dynamic experiments.
As a model system, we have chosen the unicellular, photoautotrophic cyanobacterium Synechocystis sp. PCC6803 for study, as it 1) is fully sequenced, 2) has an easily manipulated input signal (light for photosynthesis), and 3) fixes carbon dioxide into the commercially interesting, biodegradable polymer polyhydroxyalkanoate (PHA). We have created DNA microarrays with approximately 97% of the Synechocystis genome represented in duplicate to monitor the cellular transcriptional profile. These arrays are used in time-series experiments at differing light levels to measure the dynamic transcriptional response to changing environmental conditions. We have developed networks of potential genetic regulatory interactions through time-series analysis based on the data from our studies. An algorithm combining gene position information, clustering, and time-lagged correlations has been created to generate networks of hypothetical biological links. Analysis of these networks indicates that good correlation exists between the input signal and certain groups of photosynthesis- and metabolism-related genes. Furthermore, this analysis technique places these groups in a temporal context, showing the sequence of potential effects from changes in the experimental conditions. These data and the hypothetical interaction networks have been used to construct AutoRegressive with eXogenous input (ARX) models. These provide dynamic, state-space models for prediction of transcriptional profiles given a dynamically changing set of environmental perturbations...
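As an illustration of the time-lagged correlation idea named in the abstract, the sketch below scans a range of delays between an input signal and one expression profile and reports the delay with the strongest Pearson correlation. It is only a minimal sketch and not the thesis's algorithm: the function names (`lagged_correlation`, `best_lag`), the synthetic `light` and `gene` series, and the 3-sample delay are all assumptions made for the example.

```python
import numpy as np


def lagged_correlation(x, y, max_lag):
    """Pearson correlation between x[t] and y[t + lag] for lag = 0..max_lag."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    if x.ndim != 1 or x.shape != y.shape:
        raise ValueError("x and y must be 1-D series of equal length")
    if max_lag < 0 or max_lag > len(x) - 3:
        raise ValueError("max_lag leaves too few overlapping samples")
    corrs = []
    for lag in range(max_lag + 1):
        a = x[: len(x) - lag]  # input, truncated at the end
        b = y[lag:]            # output, shifted forward by `lag` samples
        corrs.append(np.corrcoef(a, b)[0, 1])
    return np.array(corrs)


def best_lag(x, y, max_lag):
    """Return (lag, correlation) with the largest absolute correlation."""
    corrs = lagged_correlation(x, y, max_lag)
    lag = int(np.argmax(np.abs(corrs)))
    return lag, float(corrs[lag])


# Synthetic data (hypothetical): a gene that follows the input 3 samples later.
rng = np.random.default_rng(0)
n_samples, true_delay = 60, 3
light = rng.normal(size=n_samples)
gene = np.concatenate([np.zeros(true_delay), light[:-true_delay]])
gene = gene + 0.1 * rng.normal(size=n_samples)  # measurement noise

lag, r = best_lag(light, gene, max_lag=8)
print(f"strongest correlation at lag {lag}: r = {r:.2f}")
```

In a real time-series microarray experiment the series would be far shorter and noisier than this synthetic one, so a lag found this way is a hypothesis for a possible regulatory link rather than evidence of one.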
Thesis (Ph. D.)--Massachusetts Institute of Technology, Dept. of Chemical Engineering, 2003. Includes bibliographical references.
Department: Massachusetts Institute of Technology. Dept. of Chemical Engineering.
Massachusetts Institute of Technology