
dc.contributor.author: Huang, Katherine
dc.contributor.author: Gevers, Dirk
dc.contributor.author: Shea, Terrance
dc.contributor.author: Young, Sarah
dc.contributor.author: Cleary, Brian Lowman
dc.contributor.author: Brito, Ilana Lauren
dc.contributor.author: Alm, Eric J
dc.date.accessioned: 2017-01-20T21:20:49Z
dc.date.available: 2017-01-20T21:20:49Z
dc.date.issued: 2015-09
dc.date.submitted: 2014-10
dc.identifier.issn: 1087-0156
dc.identifier.issn: 1546-1696
dc.identifier.uri: http://hdl.handle.net/1721.1/106576
dc.description.abstract: Analyses of metagenomic datasets that are sequenced to a depth of billions or trillions of bases can uncover hundreds of microbial genomes, but naive assembly of these data is computationally intensive, requiring hundreds of gigabytes to terabytes of RAM. We present latent strain analysis (LSA), a scalable, de novo pre-assembly method that separates reads into biologically informed partitions and thereby enables assembly of individual genomes. LSA is implemented with a streaming calculation of unobserved variables that we call eigengenomes. Eigengenomes reflect covariance in the abundance of short, fixed-length sequences, or k-mers. As the abundance of each genome in a sample is reflected in the abundance of each k-mer in that genome, eigengenome analysis can be used to partition reads from different genomes. This partitioning can be done in fixed memory using tens of gigabytes of RAM, which makes assembly and downstream analyses of terabytes of data feasible on commodity hardware. Using LSA, we assemble partial and near-complete genomes of bacterial taxa present at relative abundances as low as 0.00001%. We also show that LSA is sensitive enough to separate reads from several strains of the same species. [en_US]
dc.description.sponsorship: Rasmussen Family Foundation [en_US]
dc.description.sponsorship: National Human Genome Research Institute (U.S.) (Grant U54HG003067) [en_US]
dc.description.sponsorship: Massachusetts Institute of Technology. Center for Environmental Health Sciences [en_US]
dc.description.sponsorship: Columbia Earth Institute [en_US]
dc.language.iso: en_US
dc.publisher: Nature Publishing Group [en_US]
dc.relation.isversionof: http://dx.doi.org/10.1038/nbt.3329 [en_US]
dc.rights: Article is made available in accordance with the publisher's policy and may be subject to US copyright law. Please refer to the publisher's site for terms of use. [en_US]
dc.source: PMC [en_US]
dc.title: Detection of low-abundance bacterial strains in metagenomic datasets by eigengenome partitioning [en_US]
dc.type: Article [en_US]
dc.identifier.citation: Cleary, Brian et al. “Detection of Low-Abundance Bacterial Strains in Metagenomic Datasets by Eigengenome Partitioning.” Nature Biotechnology 33.10 (2015): 1053–1060. [en_US]
dc.contributor.department: Massachusetts Institute of Technology. Computational and Systems Biology Program [en_US]
dc.contributor.department: Massachusetts Institute of Technology. Department of Biological Engineering [en_US]
dc.contributor.mitauthor: Cleary, Brian Lowman
dc.contributor.mitauthor: Brito, Ilana Lauren
dc.contributor.mitauthor: Alm, Eric J
dc.relation.journal: Nature Biotechnology [en_US]
dc.eprint.version: Author's final manuscript [en_US]
dc.type.uri: http://purl.org/eprint/type/JournalArticle [en_US]
eprint.status: http://purl.org/eprint/status/PeerReviewed [en_US]
dspace.orderedauthors: Cleary, Brian; Brito, Ilana Lauren; Huang, Katherine; Gevers, Dirk; Shea, Terrance; Young, Sarah; Alm, Eric J [en_US]
dspace.embargo.terms: N [en_US]
dc.identifier.orcid: https://orcid.org/0000-0003-0825-7129
dc.identifier.orcid: https://orcid.org/0000-0001-8294-9364
mit.license: PUBLISHER_POLICY [en_US]

