
dc.contributor.author: Puccinelli, Robert
dc.contributor.author: Kim, Ryan
dc.contributor.author: Fordyce, Polly
dc.contributor.author: Orenstein, Yaron
dc.contributor.author: Berger Leighton, Bonnie
dc.date.accessioned: 2018-05-16T13:17:47Z
dc.date.available: 2018-05-16T13:17:47Z
dc.date.issued: 2017-09
dc.date.submitted: 2017-04
dc.identifier.issn: 2405-4712
dc.identifier.uri: http://hdl.handle.net/1721.1/115384
dc.description.abstract: Sequence libraries that cover all k-mers enable universal, unbiased measurements of binding to both oligonucleotides and peptides. While the number of k-mers grows exponentially in k, space on all experimental platforms is limited. Here, we shrink k-mer library sizes by using joker characters, which represent all characters in the alphabet simultaneously. We present the JokerCAKE (joker covering all k-mers) algorithm for generating a short sequence such that each k-mer appears at least p times with at most one joker character per k-mer. By running our algorithm on a range of parameters and alphabets, we show that JokerCAKE produces near-optimal sequences. Moreover, through comparison with data from hundreds of DNA-protein binding experiments and with new experimental results for both standard and JokerCAKE libraries, we establish that accurate binding scores can be inferred for high-affinity k-mers using JokerCAKE libraries. JokerCAKE libraries allow researchers to search a significantly larger sequence space using the same number of experimental measurements and at the same cost. We present a new compact sequence design that covers all k-mers utilizing joker characters and develop an efficient algorithm to generate such designs. We show through simulations and experimental validation that these sequence designs are useful for identifying high-affinity binding sites at significantly reduced cost and space. Keywords: sequence libraries; microarray design; de Bruijn graph [en_US]
dc.description.sponsorship: National Institutes of Health (U.S.) (Grant R01GM081871) [en_US]
dc.publisher: Elsevier [en_US]
dc.relation.isversionof: http://dx.doi.org/10.1016/J.CELS.2017.07.006 [en_US]
dc.rights: Creative Commons Attribution-NonCommercial-NoDerivs License [en_US]
dc.rights.uri: http://creativecommons.org/licenses/by-nc-nd/4.0/ [en_US]
dc.source: Elsevier [en_US]
dc.title: Optimized Sequence Library Design for Efficient In Vitro Interaction Mapping [en_US]
dc.type: Article [en_US]
dc.identifier.citation: Orenstein, Yaron et al. “Optimized Sequence Library Design for Efficient In Vitro Interaction Mapping.” Cell Systems 5, 3 (September 2017): 230–236 © 2017 The Authors [en_US]
dc.contributor.department: Massachusetts Institute of Technology. Computer Science and Artificial Intelligence Laboratory [en_US]
dc.contributor.department: Massachusetts Institute of Technology. Department of Mathematics [en_US]
dc.contributor.mitauthor: Orenstein, Yaron
dc.contributor.mitauthor: Berger Leighton, Bonnie
dc.relation.journal: Cell Systems [en_US]
dc.eprint.version: Final published version [en_US]
dc.type.uri: http://purl.org/eprint/type/JournalArticle [en_US]
eprint.status: http://purl.org/eprint/status/PeerReviewed [en_US]
dc.date.updated: 2018-05-15T18:30:46Z
dspace.orderedauthors: Orenstein, Yaron; Puccinelli, Robert; Kim, Ryan; Fordyce, Polly; Berger, Bonnie [en_US]
dspace.embargo.terms: N [en_US]
dc.identifier.orcid: https://orcid.org/0000-0002-3583-3112
dc.identifier.orcid: https://orcid.org/0000-0002-2724-7228
mit.license: PUBLISHER_CC [en_US]
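
The abstract states the coverage property that a JokerCAKE sequence should satisfy: each k-mer appears at least p times, with at most one joker character per k-mer. The sketch below only checks that property for a given sequence; it is not the JokerCAKE generation algorithm, which this record does not describe. The joker symbol `N`, the function names `kmer_coverage` and `covers_all`, and the choice to ignore reverse complements are all assumptions made for illustration.

```python
from itertools import product

JOKER = "N"  # assumed joker symbol; the record does not name one


def kmer_coverage(seq, k, alphabet="ACGT", joker=JOKER):
    """Count, for every k-mer over `alphabet`, the windows of `seq` that cover it.

    A window with one joker covers every k-mer obtained by substituting the
    joker with an alphabet character. Windows with more than one joker are
    ignored, following the "at most one joker per k-mer" constraint.
    """
    counts = {"".join(p): 0 for p in product(alphabet, repeat=k)}
    for i in range(len(seq) - k + 1):
        window = seq[i:i + k]
        n_jokers = window.count(joker)
        if n_jokers > 1:
            continue
        if n_jokers == 0:
            candidates = [window]
        else:
            j = window.index(joker)
            candidates = [window[:j] + c + window[j + 1:] for c in alphabet]
        for kmer in candidates:
            if kmer in counts:  # skip windows containing unknown symbols
                counts[kmer] += 1
    return counts


def covers_all(seq, k, p=1, alphabet="ACGT", joker=JOKER):
    """Return True if every k-mer is covered at least p times."""
    return all(c >= p for c in kmer_coverage(seq, k, alphabet, joker).values())


if __name__ == "__main__":
    # Toy two-letter alphabet: a joker-free sequence versus one using jokers.
    print(covers_all("AABBA", 2, p=1, alphabet="AB"))
    print(kmer_coverage("ANBNA", 2, alphabet="AB"))
```

On this toy alphabet the two sequences have the same length, but the one with jokers covers each 2-mer more than once, which is the kind of saving the abstract describes in general terms.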

