dc.contributor.author: Shajii, Ariya
dc.contributor.author: Yorukoglu, Deniz
dc.contributor.author: Yu, Yun William
dc.contributor.author: Berger Leighton, Bonnie
dc.date.accessioned: 2018-05-17T19:13:46Z
dc.date.available: 2018-05-17T19:13:46Z
dc.date.issued: 2016-08
dc.identifier.issn: 1367-4803
dc.identifier.issn: 1460-2059
dc.identifier.uri: http://hdl.handle.net/1721.1/115481
dc.description.abstract: Motivation: As the volume of next-generation sequencing (NGS) data increases, faster algorithms become necessary. Although speeding up individual components of a sequence analysis pipeline (e.g. read mapping) can reduce the computational cost of analysis, such approaches do not take full advantage of the particulars of a given problem. One problem of great interest, genotyping a known set of variants (e.g. dbSNP or Affymetrix SNPs), is important for characterization of known genetic traits and causative disease variants within an individual, as well as the initial stage of many ancestral and population genomic pipelines (e.g. GWAS). Results: We introduce lightweight assignment of variant alleles (LAVA), an NGS-based genotyping algorithm for a given set of SNP loci, which takes advantage of the fact that approximate matching of mid-size k-mers (with k = 32) can typically uniquely identify loci in the human genome without full read alignment. LAVA accurately calls the vast majority of SNPs in dbSNP and Affymetrix's Genome-Wide Human SNP Array 6.0 up to about an order of magnitude faster than standard NGS genotyping pipelines. For Affymetrix SNPs, LAVA has significantly higher SNP calling accuracy than existing pipelines while using as low as ∼5 GB of RAM. As such, LAVA represents a scalable computational method for population-level genotyping studies as well as a flexible NGS-based replacement for SNP arrays. Availability and Implementation: LAVA software is available at http://lava.csail.mit.edu. [en_US]
dc.publisher: Oxford University Press (OUP) [en_US]
dc.relation.isversionof: http://dx.doi.org/10.1093/BIOINFORMATICS/BTW460 [en_US]
dc.rights: Creative Commons Attribution-NonCommercial 4.0 International [en_US]
dc.rights.uri: http://creativecommons.org/licenses/by-nc/4.0/ [en_US]
dc.source: Oxford University Press [en_US]
dc.title: Fast genotyping of known SNPs through approximate k-mer matching [en_US]
dc.type: Article [en_US]
dc.identifier.citation: Shajii, Ariya et al. “Fast Genotyping of Known SNPs through Approximate k-Mer Matching.” Bioinformatics 32, 17 (September 2016): i538–i544 © 2016 The Authors [en_US]
dc.contributor.department: Massachusetts Institute of Technology. Computer Science and Artificial Intelligence Laboratory [en_US]
dc.contributor.department: Massachusetts Institute of Technology. Department of Mathematics [en_US]
dc.contributor.mitauthor: Yorukoglu, Deniz
dc.contributor.mitauthor: Yu, Yun William
dc.contributor.mitauthor: Berger Leighton, Bonnie
dc.relation.journal: Bioinformatics [en_US]
dc.eprint.version: Final published version [en_US]
dc.type.uri: http://purl.org/eprint/type/JournalArticle [en_US]
eprint.status: http://purl.org/eprint/status/PeerReviewed [en_US]
dc.date.updated: 2018-05-16T15:37:39Z
dspace.orderedauthors: Shajii, Ariya; Yorukoglu, Deniz; William Yu, Yun; Berger, Bonnie [en_US]
dspace.embargo.terms: N [en_US]
dc.identifier.orcid: https://orcid.org/0000-0003-2315-0768
dc.identifier.orcid: https://orcid.org/0000-0002-8275-9576
dc.identifier.orcid: https://orcid.org/0000-0002-2724-7228
mit.license: PUBLISHER_CC [en_US]

