Comprehensive variation discovery in single human genomes
Author(s)
Weisenfeld, Neil I; Yin, Shuangye; Sharpe, Ted; Lau, Bayo; Hegarty, Ryan; Holmes, Laurie; Sogoloff, Brian; Tabbaa, Diana; Williams, Louise; Russ, Carsten; Nusbaum, Chad; MacCallum, Iain; Jaffe, David B.; Lander, Eric Steven; ...
Open Access Policy
Creative Commons Attribution-Noncommercial-Share Alike
Abstract
Complete knowledge of the genetic variation in individual human genomes is a crucial foundation for understanding the etiology of disease. Genetic variation is typically characterized by sequencing individual genomes and comparing reads to a reference. Existing methods do an excellent job of detecting variants in approximately 90% of the human genome; however, calling variants in the remaining 10% of the genome (largely low-complexity sequence and segmental duplications) is challenging. To improve variant calling, we developed a new algorithm, DISCOVAR, and examined its performance on improved, low-cost sequence data. Using a newly created reference set of variants from the finished sequence of 103 randomly chosen fosmids, we find that some standard variant call sets miss up to 25% of variants. We show that the combination of new methods and improved data increases sensitivity by several fold, with the greatest impact in challenging regions of the human genome.
Date issued
2014-10
Department
Massachusetts Institute of Technology. Department of Biology
Journal
Nature Genetics
Publisher
Nature Publishing Group
Citation
Weisenfeld, Neil I, Shuangye Yin, Ted Sharpe, Bayo Lau, Ryan Hegarty, Laurie Holmes, Brian Sogoloff, et al. “Comprehensive Variation Discovery in Single Human Genomes.” Nature Genetics 46, no. 12 (October 19, 2014): 1350–1355.
Version: Author's final manuscript
ISSN
1061-4036
1546-1718