Biochemically informed modeling of miRNA targeting efficacy

Lin, Kathy S.

Author(s)

Lin, Kathy S.

Other Contributors

Massachusetts Institute of Technology. Computational and Systems Biology Program.

Advisor

David P. Bartel.

Terms of use

MIT theses may be protected by copyright. Please reuse MIT thesis content according to the MIT Libraries Permissions Policy, which is available through the URL provided. http://dspace.mit.edu/handle/1721.1/7582

Abstract

In metazoans, microRNAs (miRNAs) are short pieces of RNA that load into Argonaute (AGO) proteins and base-pair to complementary sequences in mRNAs. Upon binding an mRNA, AGO-miRNA complexes recruit machinery that translationally represses and degrades the mRNAs. Mammalian genomes encode hundreds of miRNAs, and most mRNAs in mammals have evolutionarily conserved target sites to at least one of these miRNAs. Because of the widespread and varied roles of miRNAs in regulating gene expression, there have been many efforts over the past decade to predict the extent of targeting between a miRNA and an mRNA from their sequences alone. This targeting relationship between a miRNA and an mRNA depends on the binding affinities of the AGO-miRNA complex for target sites on the mRNA, which are poorly predicted by nearest-neighbor rules used for predicting RNA-RNA duplex stabilities.

This is presumably because AGO modulates the energetics of duplexes formed between its loaded miRNA and mRNA target sites. The recent development of a high-throughput method of measuring RNA-binding affinities, RNA bind-n-seq (RBNS), has allowed us to determine the relative KD values for AGO-miRNA complexes binding to hundreds of thousands of potential target sites. In this work, we use these biochemical parameters to build a quantitative model of miRNA targeting that predicts mRNA repression by a miRNA in cells better than existing in silico models. We then expand this approach to all miRNAs, including those for which we have not measured binding affinities, by training a convolutional neural network (CNN) to predict the binding affinity between arbitrary miRNA and target sequences. We show that CNN-predicted KD values parallel the utility of experimentally determined KD values in predicting the repression of mRNAs in cells.
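
As a rough illustration of why a lower KD is expected to mean stronger targeting, the sketch below computes the equilibrium occupancy of a target site under a simple one-site binding assumption. This is a textbook binding isotherm, not necessarily the model developed in the thesis; the function names, the `free_ago` parameter, and the arbitrary shared units are assumptions made for this example.

```python
from typing import Iterable


def site_occupancy(kd: float, free_ago: float) -> float:
    """Fraction of time one site is bound at equilibrium.

    Assumes simple one-site binding: occupancy = [AGO] / ([AGO] + KD).
    `kd` and `free_ago` must be in the same (arbitrary) units.
    """
    if kd <= 0:
        raise ValueError("kd must be positive")
    if free_ago < 0:
        raise ValueError("free_ago must be non-negative")
    return free_ago / (free_ago + kd)


def expected_bound_sites(kd_values: Iterable[float], free_ago: float) -> float:
    """Expected number of bound AGO-miRNA complexes on one mRNA.

    Assumes sites bind independently, so occupancies simply add.
    """
    return sum(site_occupancy(kd, free_ago) for kd in kd_values)


if __name__ == "__main__":
    # Hypothetical relative KD values for three sites on one mRNA.
    example_kds = [0.1, 1.0, 10.0]
    for kd in example_kds:
        print(f"KD={kd}: occupancy={site_occupancy(kd, free_ago=1.0):.3f}")
    print("expected bound sites:", expected_bound_sites(example_kds, free_ago=1.0))
```

In this toy picture, a site whose KD equals the free AGO-miRNA concentration is bound half the time, and higher-affinity (lower KD) sites approach full occupancy.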

By measuring the binding affinities between miRNAs and their targets, we can also estimate how much binding affinity contributes to miRNA-mediated targeting. Although the majority of the variance in targeting is attributable to binding affinity, about 40% of the variance remains unexplained, motivating future efforts to expand the deep learning framework to learn important features of mRNAs outside of target sites that influence miRNA activity.

Description

Thesis: Ph. D., Massachusetts Institute of Technology, Computational and Systems Biology Program, February, 2020

Cataloged from student-submitted PDF of thesis. Vita.

Includes bibliographical references.

Date issued

2020

URI

https://hdl.handle.net/1721.1/129906

Department

Massachusetts Institute of Technology. Computational and Systems Biology Program

Publisher

Massachusetts Institute of Technology

Keywords

Computational and Systems Biology Program.

Collections

Doctoral Theses