
dc.contributor.author: Luo, Yunan
dc.contributor.author: Yu, Yun William
dc.contributor.author: Zeng, Jianyang
dc.contributor.author: Berger Leighton, Bonnie
dc.contributor.author: Peng, Jian
dc.date.accessioned: 2019-11-08T18:08:50Z
dc.date.available: 2019-11-08T18:08:50Z
dc.date.issued: 2018-07-13
dc.date.submitted: 2018-06
dc.identifier.issn: 1367-4803
dc.identifier.issn: 1460-2059
dc.identifier.uri: https://hdl.handle.net/1721.1/122806
dc.description.abstract: Motivation: Vastly greater quantities of microbial genome data are being generated where environmental samples mix together the DNA from many different species. Here, we present Opal for metagenomic binning, the task of identifying the origin species of DNA sequencing reads. We introduce 'low-density' locality-sensitive hashing to bioinformatics, with the addition of Gallager codes for even coverage, enabling quick and accurate metagenomic binning. Results: On public benchmarks, Opal halves the error on precision/recall (F1-score) as compared with both alignment-based and alignment-free methods for species classification. We demonstrate even more marked improvement at higher taxonomic levels, allowing for the discovery of novel lineages. Furthermore, the innovation of low-density, even-coverage hashing should itself prove an essential methodological advance as it enables the application of machine learning to other bioinformatic challenges. Availability and implementation: Full source code and datasets are available at http://opal.csail.mit.edu and https://github.com/yunwilliamyu/opal. Supplementary information: Supplementary data are available at Bioinformatics online. [en_US]
dc.description.sponsorship: National Institutes of Health (U.S.) (Grant GM108348) [en_US]
dc.language.iso: en
dc.publisher: Oxford University Press (OUP) [en_US]
dc.relation.isversionof: http://dx.doi.org/10.1093/bioinformatics/bty611 [en_US]
dc.rights: Creative Commons Attribution NonCommercial License 4.0 [en_US]
dc.rights.uri: https://creativecommons.org/licenses/by-nc/4.0/ [en_US]
dc.source: Oxford University Press [en_US]
dc.title: Metagenomic binning through low-density hashing [en_US]
dc.type: Article [en_US]
dc.identifier.citation: Luo, Yunan, et al. "Metagenomic binning through low-density hashing." Bioinformatics 35, 2 (January 2019): 219–226 © 2018 The Author(s) [en_US]
dc.contributor.department: Massachusetts Institute of Technology. Department of Mathematics [en_US]
dc.contributor.department: Massachusetts Institute of Technology. Computer Science and Artificial Intelligence Laboratory [en_US]
dc.relation.journal: Bioinformatics [en_US]
dc.eprint.version: Final published version [en_US]
dc.type.uri: http://purl.org/eprint/type/JournalArticle [en_US]
eprint.status: http://purl.org/eprint/status/PeerReviewed [en_US]
dc.date.updated: 2019-11-07T19:05:01Z
dspace.date.submission: 2019-11-07T19:05:04Z
mit.journal.volume: 35 [en_US]
mit.journal.issue: 2 [en_US]

