GPU-accelerated Chemical Similarity Assessment for Large Scale Databases

Maggioni, Marco; Santambrogio, Marco Domenico; Liang, Jie

dc.contributor.author	Maggioni, Marco
dc.contributor.author	Santambrogio, Marco Domenico
dc.contributor.author	Liang, Jie
dc.date.accessioned	2014-12-12T19:16:42Z
dc.date.available	2014-12-12T19:16:42Z
dc.date.issued	2011
dc.identifier.issn	18770509
dc.identifier.uri	http://hdl.handle.net/1721.1/92298
dc.description.abstract	The assessment of chemical similarity between molecules is a basic operation in chemoinformatics, a computational area concerned with the manipulation of chemical structural information. Comparing molecules is the basis for a wide range of applications, such as searching chemical databases, training prediction models for virtual screening, or aggregating clusters of similar compounds. However, currently available multimillion-compound databases pose a challenge for conventional chemoinformatics algorithms, raising the need for faster similarity methods. In this paper, we extensively analyze the advantages of using many-core architectures for calculating some commonly used chemical similarity coefficients such as Tanimoto, Dice, or Cosine. Our aim is to provide a wide-breadth proof of concept of the usefulness of GPU architectures to chemoinformatics, a class of computing problems that GPU computing has yet to cover. In our work, we present a general GPU algorithm for all-to-all chemical comparisons that considers both binary fingerprints and floating-point descriptors as molecule representations. Subsequently, we adopt optimization techniques to minimize global memory accesses and to further improve efficiency. We test the proposed algorithm on two experimental setups: a laptop with a low-end GPU and a desktop with a higher-performance GPU. In the former case, we obtain a 4-to-6-fold speed-up over a single-core implementation for fingerprints and a 4-to-7-fold speed-up for descriptors. In the latter case, we obtain a 195-to-206-fold speed-up and a 100-to-328-fold speed-up, respectively.	en_US
dc.description.sponsorship	National Institutes of Health (U.S.) (grant GM079804)	en_US
dc.description.sponsorship	National Institutes of Health (U.S.) (grant GM086145)	en_US
dc.language.iso	en_US
dc.publisher	Elsevier B.V.	en_US
dc.relation.isversionof	http://dx.doi.org/10.1016/j.procs.2011.04.219	en_US
dc.rights	Creative Commons Attribution	en_US
dc.rights.uri	http://creativecommons.org/licenses/by-nc-nd/3.0/	en_US
dc.source	Elsevier	en_US
dc.title	GPU-accelerated Chemical Similarity Assessment for Large Scale Databases	en_US
dc.type	Article	en_US
dc.identifier.citation	Maggioni, Marco, Marco Domenico Santambrogio, and Jie Liang. “GPU-Accelerated Chemical Similarity Assessment for Large Scale Databases.” Procedia Computer Science 4 (2011): 2007–2016. © 2011 Elsevier B.V.	en_US
dc.contributor.department	Massachusetts Institute of Technology. Computer Science and Artificial Intelligence Laboratory	en_US
dc.contributor.mitauthor	Santambrogio, Marco Domenico	en_US
dc.relation.journal	Procedia Computer Science	en_US
dc.eprint.version	Final published version	en_US
dc.type.uri	http://purl.org/eprint/type/JournalArticle	en_US
eprint.status	http://purl.org/eprint/status/PeerReviewed	en_US
dspace.orderedauthors	Maggioni, Marco; Santambrogio, Marco Domenico; Liang, Jie	en_US
mit.license	PUBLISHER_CC	en_US
mit.metadata.status	Complete
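
For readers unfamiliar with the coefficients named in the abstract, the following is a minimal CPU-side sketch of an all-to-all comparison over binary fingerprints. It is not the authors' GPU algorithm, which this record does not describe in detail. It uses the standard set-based definitions of Tanimoto, Dice, and Cosine similarity. Representing each fingerprint as a Python integer, the function names (popcount, similarity, all_to_all), and the convention of returning 0.0 for an empty fingerprint are all assumptions made for illustration.

```python
from math import sqrt


def popcount(x):
    """Return the number of set bits in a non-negative integer."""
    return bin(x).count("1")


def similarity(fp_a, fp_b, kind="tanimoto"):
    """Similarity of two binary fingerprints stored as integers (assumed layout)."""
    a = popcount(fp_a)          # bits set in the first fingerprint
    b = popcount(fp_b)          # bits set in the second fingerprint
    c = popcount(fp_a & fp_b)   # bits set in both
    if a == 0 or b == 0:
        return 0.0              # assumed convention for empty fingerprints
    if kind == "tanimoto":
        return c / (a + b - c)
    if kind == "dice":
        return 2 * c / (a + b)
    if kind == "cosine":
        return c / sqrt(a * b)
    raise ValueError("unknown coefficient: " + kind)


def all_to_all(fingerprints, kind="tanimoto"):
    """Compare every fingerprint with every other one (n x n result matrix)."""
    n = len(fingerprints)
    return [
        [similarity(fingerprints[i], fingerprints[j], kind) for j in range(n)]
        for i in range(n)
    ]
```

Every entry of the result matrix depends only on its own pair of fingerprints, so the n x n comparisons are independent of one another. That independence is the property that makes the all-to-all workload a natural fit for many-core hardware.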

Files in this item

Name: Maggioni-2011-GPU-accelerated ...
Size: 392.4Kb
Format: PDF


This item appears in the following Collection(s)

MIT Open Access Articles
