MIT Libraries logoDSpace@MIT

MIT
View Item 
  • DSpace@MIT Home
  • MIT Open Access Articles
  • MIT Open Access Articles
  • View Item
  • DSpace@MIT Home
  • MIT Open Access Articles
  • MIT Open Access Articles
  • View Item
JavaScript is disabled for your browser. Some features of this site may not work without it.

Information Theory of DNA Shotgun Sequencing

Author(s)
Motahari, Abolfazl S.; Bresler, Guy; Tse, David N. C.
Thumbnail
DownloadBresler_Information theory.pdf (413.6Kb)
OPEN_ACCESS_POLICY

Open Access Policy

Creative Commons Attribution-Noncommercial-Share Alike

Terms of use
Creative Commons Attribution-Noncommercial-Share Alike http://creativecommons.org/licenses/by-nc-sa/4.0/
Metadata
Show full item record
Abstract
DNA sequencing is the basic workhorse of modern day biology and medicine. Shotgun sequencing is the dominant technique used: many randomly located short fragments called reads are extracted from the DNA sequence, and these reads are assembled to reconstruct the original sequence. A basic question is: given a sequencing technology and the statistics of the DNA sequence, what is the minimum number of reads required for reliable reconstruction? This number provides a fundamental limit to the performance of any assembly algorithm. For a simple statistical model of the DNA sequence and the read process, we show that the answer admits a critical phenomenon in the asymptotic limit of long DNA sequences: if the read length is below a threshold, reconstruction is impossible no matter how many reads are observed, and if the read length is above the threshold, having enough reads to cover the DNA sequence is sufficient to reconstruct. The threshold is computed in terms of the Renyi entropy rate of the DNA sequence. We also study the impact of noise in the read process on the performance.
Date issued
2013-10
URI
http://hdl.handle.net/1721.1/110778
Department
Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science
Journal
IEEE Transactions on Information Theory
Publisher
Institute of Electrical and Electronics Engineers (IEEE)
Citation
Motahari, Abolfazl S.; Bresler, Guy and Tse, David N. C. “Information Theory of DNA Shotgun Sequencing.” IEEE Transactions on Information Theory 59, 10 (October 2013): 6273–6289 © 2013 Institute of Electrical and Electronics Engineers (IEEE)
Version: Original manuscript
ISSN
0018-9448
1557-9654

Collections
  • MIT Open Access Articles

Browse

All of DSpaceCommunities & CollectionsBy Issue DateAuthorsTitlesSubjectsThis CollectionBy Issue DateAuthorsTitlesSubjects

My Account

Login

Statistics

OA StatisticsStatistics by CountryStatistics by Department
MIT Libraries
PrivacyPermissionsAccessibilityContact us
MIT
Content created by the MIT Libraries, CC BY-NC unless otherwise noted. Notify us about copyright concerns.