Show simple item record

dc.contributor.author: Motahari, Abolfazl S.
dc.contributor.author: Bresler, Guy
dc.contributor.author: Tse, David N. C.
dc.date.accessioned: 2017-07-19T18:40:47Z
dc.date.available: 2017-07-19T18:40:47Z
dc.date.issued: 2013-10
dc.date.submitted: 2013-05
dc.identifier.issn: 0018-9448
dc.identifier.issn: 1557-9654
dc.identifier.uri: http://hdl.handle.net/1721.1/110778
dc.description.abstract: DNA sequencing is the basic workhorse of modern day biology and medicine. Shotgun sequencing is the dominant technique used: many randomly located short fragments called reads are extracted from the DNA sequence, and these reads are assembled to reconstruct the original sequence. A basic question is: given a sequencing technology and the statistics of the DNA sequence, what is the minimum number of reads required for reliable reconstruction? This number provides a fundamental limit to the performance of any assembly algorithm. For a simple statistical model of the DNA sequence and the read process, we show that the answer admits a critical phenomenon in the asymptotic limit of long DNA sequences: if the read length is below a threshold, reconstruction is impossible no matter how many reads are observed, and if the read length is above the threshold, having enough reads to cover the DNA sequence is sufficient to reconstruct. The threshold is computed in terms of the Renyi entropy rate of the DNA sequence. We also study the impact of noise in the read process on the performance. (en_US)
dc.language.iso: en_US
dc.publisher: Institute of Electrical and Electronics Engineers (IEEE) (en_US)
dc.relation.isversionof: http://dx.doi.org/10.1109/tit.2013.2270273 (en_US)
dc.rights: Creative Commons Attribution-Noncommercial-Share Alike (en_US)
dc.rights.uri: http://creativecommons.org/licenses/by-nc-sa/4.0/ (en_US)
dc.source: arXiv (en_US)
dc.title: Information Theory of DNA Shotgun Sequencing (en_US)
dc.type: Article (en_US)
dc.identifier.citation: Motahari, Abolfazl S.; Bresler, Guy and Tse, David N. C. “Information Theory of DNA Shotgun Sequencing.” IEEE Transactions on Information Theory 59, 10 (October 2013): 6273–6289 © 2013 Institute of Electrical and Electronics Engineers (IEEE) (en_US)
dc.contributor.department: Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science (en_US)
dc.contributor.mitauthor: Bresler, Guy
dc.relation.journal: IEEE Transactions on Information Theory (en_US)
dc.eprint.version: Original manuscript (en_US)
dc.type.uri: http://purl.org/eprint/type/JournalArticle (en_US)
eprint.status: http://purl.org/eprint/status/NonPeerReviewed (en_US)
dspace.orderedauthors: Motahari, Abolfazl S.; Bresler, Guy; Tse, David N. C. (en_US)
dspace.embargo.terms: N (en_US)
dc.identifier.orcid: https://orcid.org/0000-0003-1303-582X
mit.license: OPEN_ACCESS_POLICY (en_US)

