Biochemically informed modeling of miRNA targeting efficacy
Author(s)
Lin, Kathy S.
Other Contributors
Massachusetts Institute of Technology. Computational and Systems Biology Program.
Advisor
David P. Bartel.
Abstract
In metazoans, microRNAs (miRNAs) are short RNAs that load into Argonaute (AGO) proteins and base-pair to complementary sequences in mRNAs. Upon binding an mRNA, AGO-miRNA complexes recruit machinery that translationally represses and degrades the mRNA. Mammalian genomes encode hundreds of miRNAs, and most mammalian mRNAs have evolutionarily conserved target sites for at least one of these miRNAs. Because of the widespread and varied roles of miRNAs in regulating gene expression, there have been many efforts over the past decade to predict the extent of targeting between a miRNA and an mRNA from their sequences alone. This targeting relationship depends on the binding affinities of the AGO-miRNA complex for target sites on the mRNA, which are poorly predicted by the nearest-neighbor rules used for predicting RNA-RNA duplex stabilities. This is presumably because AGO modulates the energetics of duplexes formed between its loaded miRNA and mRNA target sites. The recent development of a high-throughput method of measuring RNA-binding affinities, RNA bind-n-seq (RBNS), has allowed us to determine the relative KD values for AGO-miRNA complexes binding to hundreds of thousands of potential target sites. In this work, we use these biochemical parameters to build a quantitative model of miRNA targeting that predicts mRNA repression by a miRNA in cells better than existing in silico models. We then extend this approach to all miRNAs, including those for which we have not measured binding affinities, by training a convolutional neural network (CNN) to predict the binding affinity between arbitrary miRNA and target sequences. We show that CNN-predicted KD values parallel the utility of experimentally determined KD values in predicting the repression of mRNAs in cells. By measuring the binding affinities between miRNAs and their targets, we can also estimate how much binding affinity contributes to miRNA-mediated targeting. Although the majority of the variance in targeting is attributable to binding affinity, about 40% of the variance remains unexplained, motivating future efforts to expand the deep learning framework to learn important features of mRNAs outside of target sites that influence miRNA activity.
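As a rough illustration of why a KD value is a useful input for predicting targeting, the sketch below applies the standard one-site equilibrium binding relation, occupancy = [AGO] / ([AGO] + KD). It is not the model developed in the thesis. The function names, the free AGO-miRNA concentration, and the example KD values are all hypothetical and chosen only to show how stronger binding (lower KD) translates into higher site occupancy.

```python
from typing import Iterable


def site_occupancy(free_ago: float, kd: float) -> float:
    """Equilibrium fraction of a single site bound: [AGO] / ([AGO] + KD).

    Both arguments must be in the same (arbitrary) concentration units.
    This is the textbook one-site binding isotherm, used here only as
    an illustrative assumption.
    """
    if free_ago < 0 or kd <= 0:
        raise ValueError("free_ago must be >= 0 and kd must be > 0")
    return free_ago / (free_ago + kd)


def expected_bound_sites(free_ago: float, kds: Iterable[float]) -> float:
    """Sum of per-site occupancies for an mRNA with several target sites.

    Treats sites as independent, which is a simplifying assumption.
    """
    return sum(site_occupancy(free_ago, kd) for kd in kds)


if __name__ == "__main__":
    # Hypothetical values: one strong site (KD = 1) and one weaker site (KD = 3),
    # with free AGO-miRNA at 1 in the same units.
    print(expected_bound_sites(1.0, [1.0, 3.0]))
```

In this toy setting, a site whose KD equals the free AGO-miRNA concentration is half occupied, and weaker sites contribute proportionally less to the total.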
Description
Thesis: Ph.D., Massachusetts Institute of Technology, Computational and Systems Biology Program, February 2020. Cataloged from student-submitted PDF of thesis. Vita. Includes bibliographical references.
Date issued
2020
Department
Massachusetts Institute of Technology. Computational and Systems Biology Program
Publisher
Massachusetts Institute of Technology
Keywords
Computational and Systems Biology Program.