
dc.contributor.author: Orenstein, Yaron
dc.contributor.author: Berger Leighton, Bonnie
dc.date.accessioned: 2018-04-03T16:43:24Z
dc.date.available: 2018-04-03T16:43:24Z
dc.date.issued: 2015-12
dc.identifier.issn: 1066-5277
dc.identifier.issn: 1557-8666
dc.identifier.uri: http://hdl.handle.net/1721.1/114507
dc.description.abstract: Current microarray technologies to determine RNA structure or measure protein-RNA interactions rely on single-stranded, unstructured RNA probes on a chip covering together all k-mers. Since space on the array is limited, the problem is to efficiently design a compact library of unstructured ℓ-long RNA probes, where each k-mer is covered at least p times. Ray et al. designed such a library for specific values of k, ℓ, and p using ad-hoc rules. To our knowledge, there is no general method to date to solve this problem. Here, we address the problem of finding a minimum-size covering of all k-mers by ℓ-long sequences with the desired properties for any value of k, ℓ, and p. As we prove that the problem is NP-hard, we give two solutions: the first is a greedy algorithm with a logarithmic approximation ratio; the second, a heuristic greedy approach based on random walks in de Bruijn graphs. The heuristic algorithm works well in practice and produces a library of unstructured RNA probes that is only ∼1.1-times greater in size compared to the theoretical lower bound. We present results for typical values of k and probe lengths ℓ and show that our algorithm generates a library that is significantly smaller than the library of Ray et al.; moreover, we show that our algorithm outperforms naive methods. Our approach can be generalized and extended to generate RNA or DNA oligo libraries with other desired properties. The software is freely available online. Keywords: de Bruijn graph; microarray library design; RNA secondary structure [en_US]
dc.description.sponsorship: United States. National Institutes of Health (Grant R01GM081871) [en_US]
dc.publisher: Mary Ann Liebert Inc [en_US]
dc.relation.isversionof: http://dx.doi.org/10.1089/CMB.2015.0179 [en_US]
dc.rights: Article is made available in accordance with the publisher's policy and may be subject to US copyright law. Please refer to the publisher's site for terms of use. [en_US]
dc.source: Mary Ann Liebert [en_US]
dc.title: Efficient Design of Compact Unstructured RNA Libraries Covering All k-mers [en_US]
dc.type: Article [en_US]
dc.identifier.citation: Orenstein, Yaron, and Bonnie Berger. “Efficient Design of Compact Unstructured RNA Libraries Covering All k-Mers.” Journal of Computational Biology 23, 2 (February 2016) © 2015 The Author(s) [en_US]
dc.contributor.department: Massachusetts Institute of Technology. Computer Science and Artificial Intelligence Laboratory [en_US]
dc.contributor.department: Massachusetts Institute of Technology. Department of Mathematics [en_US]
dc.contributor.mitauthor: Orenstein, Yaron
dc.contributor.mitauthor: Berger Leighton, Bonnie
dc.relation.journal: Journal of Computational Biology [en_US]
dc.eprint.version: Final published version [en_US]
dc.type.uri: http://purl.org/eprint/type/JournalArticle [en_US]
eprint.status: http://purl.org/eprint/status/PeerReviewed [en_US]
dc.date.updated: 2018-02-23T15:28:19Z
dspace.orderedauthors: Orenstein, Yaron; Berger, Bonnie [en_US]
dspace.embargo.terms: N [en_US]
dc.identifier.orcid: https://orcid.org/0000-0002-3583-3112
dc.identifier.orcid: https://orcid.org/0000-0002-2724-7228
mit.license: PUBLISHER_POLICY [en_US]

