RNA : algorithms, evolution and design
Author(s)
Schnall-Levin, Michael (Michael Benjamin)
Alternative title
Ribonucleic acid
Other Contributors
Massachusetts Institute of Technology. Dept. of Mathematics.
Advisor
Bonnie Berger.
Abstract
Modern biology is being remade by a dizzying array of new technologies, a deluge of data, and an increasingly strong reliance on computation to guide and interpret experiments. In two areas of biology, computational methods have become central: predicting and designing the structure of biological molecules and inferring function from molecular evolution. In this thesis, I develop a number of algorithms for problems in these areas and combine them with experiment to provide biological insight. First, I study the problem of designing RNA sequences that fold into specific structures. To do so, I introduce a novel computational problem on Hidden Markov Models (HMMs) and Stochastic Context-Free Grammars (SCFGs). I show that the problem is NP-hard, resolving an open question for RNA secondary structure design, and go on to develop a number of approximation approaches. I then turn to the problem of inferring function from evolution. I develop an algorithm to identify regions in the genome that are serving two simultaneous functions: encoding a protein and encoding regulatory information. I first use this algorithm to find microRNA targets in both Drosophila and mammalian genes and show that conserved microRNA targeting in coding regions is widespread. Next, I identify a novel phenomenon in which an accumulation of sequence repeats leads to surprisingly strong microRNA targeting, demonstrating a previously unknown role for such repeats. Finally, I address the problem of detecting more general conserved regulatory elements in coding DNA. I show that such elements are widespread in Drosophila and can be identified with high confidence, a result with important implications for understanding both biological regulation and the evolution of protein-coding sequences.
Description
Thesis (Ph. D.)--Massachusetts Institute of Technology, Dept. of Mathematics, 2011. This electronic version was submitted by the student author. The certified thesis is available in the Institute Archives and Special Collections. Cataloged from student submitted PDF version of thesis. Includes bibliographical references (p. 205-214).
Date issued
2011
Department
Massachusetts Institute of Technology. Department of Mathematics
Publisher
Massachusetts Institute of Technology
Keywords
Mathematics.