| dc.contributor.author | Luo, Yunan | |
| dc.contributor.author | Yu, Yun William | |
| dc.contributor.author | Zeng, Jianyang | |
| dc.contributor.author | Berger Leighton, Bonnie | |
| dc.contributor.author | Peng, Jian | |
| dc.date.accessioned | 2019-11-08T18:08:50Z | |
| dc.date.available | 2019-11-08T18:08:50Z | |
| dc.date.issued | 2018-07-13 | |
| dc.date.submitted | 2018-06 | |
| dc.identifier.issn | 1367-4803 | |
| dc.identifier.issn | 1460-2059 | |
| dc.identifier.uri | https://hdl.handle.net/1721.1/122806 | |
| dc.description.abstract | Motivation: Vastly greater quantities of microbial genome data are being generated where environmental samples mix together the DNA from many different species. Here, we present Opal for metagenomic binning, the task of identifying the origin species of DNA sequencing reads. We introduce 'low-density' locality sensitive hashing to bioinformatics, with the addition of Gallager codes for even coverage, enabling quick and accurate metagenomic binning. Results: On public benchmarks, Opal halves the error on precision/recall (F1-score) as compared with both alignment-based and alignment-free methods for species classification. We demonstrate even more marked improvement at higher taxonomic levels, allowing for the discovery of novel lineages. Furthermore, the innovation of low-density, even-coverage hashing should itself prove an essential methodological advance as it enables the application of machine learning to other bioinformatic challenges. Availability and implementation: Full source code and datasets are available at http://opal.csail.mit.edu and https://github.com/yunwilliamyu/opal. Supplementary information: Supplementary data are available at Bioinformatics online. | en_US |
| dc.description.sponsorship | National Institutes of Health (U.S.) (Grant GM108348) | en_US |
| dc.language.iso | en | |
| dc.publisher | Oxford University Press (OUP) | en_US |
| dc.relation.isversionof | http://dx.doi.org/10.1093/bioinformatics/bty611 | en_US |
| dc.rights | Creative Commons Attribution NonCommercial License 4.0 | en_US |
| dc.rights.uri | https://creativecommons.org/licenses/by-nc/4.0/ | en_US |
| dc.source | Oxford University Press | en_US |
| dc.title | Metagenomic binning through low-density hashing | en_US |
| dc.type | Article | en_US |
| dc.identifier.citation | Luo, Yunan, et al. "Metagenomic binning through low-density hashing." Bioinformatics 35, 2 (January 2019): 219–226 © 2018 The Author(s) | en_US |
| dc.contributor.department | Massachusetts Institute of Technology. Department of Mathematics | en_US |
| dc.contributor.department | Massachusetts Institute of Technology. Computer Science and Artificial Intelligence Laboratory | en_US |
| dc.relation.journal | Bioinformatics | en_US |
| dc.eprint.version | Final published version | en_US |
| dc.type.uri | http://purl.org/eprint/type/JournalArticle | en_US |
| eprint.status | http://purl.org/eprint/status/PeerReviewed | en_US |
| dc.date.updated | 2019-11-07T19:05:01Z | |
| dspace.date.submission | 2019-11-07T19:05:04Z | |
| mit.journal.volume | 35 | en_US |
| mit.journal.issue | 2 | en_US |