Show simple item record

dc.contributor.author: Orenstein, Yaron
dc.contributor.author: Wang, Yuhao
dc.contributor.author: Berger Leighton, Bonnie
dc.date.accessioned: 2018-05-23T17:12:48Z
dc.date.available: 2018-05-23T17:12:48Z
dc.date.issued: 2016-06
dc.identifier.issn: 1367-4803
dc.identifier.issn: 1460-2059
dc.identifier.uri: http://hdl.handle.net/1721.1/115819
dc.description.abstract: Motivation: Protein-RNA interactions, which play vital roles in many processes, are mediated through both RNA sequence and structure. CLIP-based methods, which measure protein-RNA binding in vivo, suffer from experimental noise and systematic biases, whereas in vitro experiments capture a clearer signal of protein RNA-binding. Among them, RNAcompete provides binding affinities of a specific protein to more than 240 000 unstructured RNA probes in one experiment. The computational challenge is to infer RNA structure- and sequence-based binding models from these data. The state-of-the-art in sequence models, Deepbind, does not model structural preferences. RNAcontext models both sequence and structure preferences, but is outperformed by GraphProt. Unfortunately, GraphProt cannot detect structural preferences from RNAcompete data due to the unstructured nature of the data, as noted by its developers, nor can it be tractably run on the full RNACompete dataset. Results: We develop RCK, an efficient, scalable algorithm that infers both sequence and structure preferences based on a new k-mer based model. Remarkably, even though RNAcompete data is designed to be unstructured, RCK can still learn structural preferences from it. RCK significantly outperforms both RNAcontext and Deepbind in in vitro binding prediction for 244 RNAcompete experiments. Moreover, RCK is also faster and uses less memory, which enables scalability. While currently on par with existing methods in in vivo binding prediction on a small scale test, we demonstrate that RCK will increasingly benefit from experimentally measured RNA structure profiles as compared to computationally predicted ones. By running RCK on the entire RNAcompete dataset, we generate and provide as a resource a set of protein-RNA structure-based models on an unprecedented scale. [en_US]
dc.description.sponsorship: National Institutes of Health (U.S.) (Grant R01GM081871) [en_US]
dc.publisher: Oxford University Press (OUP) [en_US]
dc.relation.isversionof: http://dx.doi.org/10.1093/BIOINFORMATICS/BTW259 [en_US]
dc.rights: Creative Commons Attribution-NonCommercial 4.0 International [en_US]
dc.rights.uri: http://creativecommons.org/licenses/by-nc/4.0/ [en_US]
dc.source: Oxford University Press [en_US]
dc.title: RCK: accurate and efficient inference of sequence- and structure-based protein–RNA binding models from RNAcompete data [en_US]
dc.type: Article [en_US]
dc.identifier.citation: Orenstein, Yaron, et al. “RCK: Accurate and Efficient Inference of Sequence- and Structure-Based Protein–RNA Binding Models from RNAcompete Data.” Bioinformatics, vol. 32, no. 12, June 2016, pp. i351–59. [en_US]
dc.contributor.department: Massachusetts Institute of Technology. Computer Science and Artificial Intelligence Laboratory [en_US]
dc.contributor.department: Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science [en_US]
dc.contributor.department: Massachusetts Institute of Technology. Department of Mathematics [en_US]
dc.contributor.mitauthor: Orenstein, Yaron
dc.contributor.mitauthor: Wang, Yuhao
dc.contributor.mitauthor: Berger Leighton, Bonnie
dc.relation.journal: Bioinformatics [en_US]
dc.eprint.version: Final published version [en_US]
dc.type.uri: http://purl.org/eprint/type/JournalArticle [en_US]
eprint.status: http://purl.org/eprint/status/PeerReviewed [en_US]
dc.date.updated: 2018-05-16T15:34:44Z
dspace.orderedauthors: Orenstein, Yaron; Wang, Yuhao; Berger, Bonnie [en_US]
dspace.embargo.terms: N [en_US]
dc.identifier.orcid: https://orcid.org/0000-0002-3583-3112
dc.identifier.orcid: https://orcid.org/0000-0002-3430-6943
dc.identifier.orcid: https://orcid.org/0000-0002-2724-7228
mit.license: PUBLISHER_CC [en_US]

