| dc.contributor.author | Maggioni, Marco |  | 
| dc.contributor.author | Santambrogio, Marco Domenico |  | 
| dc.contributor.author | Liang, Jie |  | 
| dc.date.accessioned | 2014-12-12T19:16:42Z |  | 
| dc.date.available | 2014-12-12T19:16:42Z |  | 
| dc.date.issued | 2011 |  | 
| dc.identifier.issn | 18770509 |  | 
| dc.identifier.uri | http://hdl.handle.net/1721.1/92298 |  | 
| dc.description.abstract | The assessment of chemical similarity between molecules is a basic operation in chemoinformatics, a computational area concerned with the manipulation of chemical structural information. Comparing molecules is the basis for a wide range of applications such as searching in chemical databases, training prediction models for virtual screening, or aggregating clusters of similar compounds. However, currently available multimillion databases represent a challenge for conventional chemoinformatics algorithms, raising the need for faster similarity methods. In this paper, we extensively analyze the advantages of using many-core architectures for calculating some commonly used chemical similarity coefficients such as Tanimoto, Dice or Cosine. Our aim is to provide a wide-breadth proof-of-concept regarding the usefulness of GPU architectures to chemoinformatics, a class of computing problems still uncovered. In our work, we present a general GPU algorithm for all-to-all chemical comparisons considering both binary fingerprints and floating point descriptors as molecule representation. Subsequently, we adopt optimization techniques to minimize global memory accesses and to further improve efficiency. We test the proposed algorithm on different experimental setups, a laptop with a low-end GPU and a desktop with a more performant GPU. In the former case, we obtain a 4-to-6-fold speed-up over a single-core implementation for fingerprints and a 4-to-7-fold speed-up for descriptors. In the latter case, we respectively obtain a 195-to-206-fold speed-up and a 100-to-328-fold speed-up. | en_US | 
| dc.description.sponsorship | National Institutes of Health (U.S.) (grant GM079804) | en_US | 
| dc.description.sponsorship | National Institutes of Health (U.S.) (grant GM086145) | en_US | 
| dc.language.iso | en_US |  | 
| dc.publisher | Elsevier B.V. | en_US | 
| dc.relation.isversionof | http://dx.doi.org/10.1016/j.procs.2011.04.219 | en_US | 
| dc.rights | Creative Commons Attribution | en_US | 
| dc.rights.uri | http://creativecommons.org/licenses/by-nc-nd/3.0/ | en_US | 
| dc.source | Elsevier | en_US | 
| dc.title | GPU-accelerated Chemical Similarity Assessment for Large Scale Databases | en_US | 
| dc.type | Article | en_US | 
| dc.identifier.citation | Maggioni, Marco, Marco Domenico Santambrogio, and Jie Liang. “GPU-Accelerated Chemical Similarity Assessment for Large Scale Databases.” Procedia Computer Science 4 (2011): 2007–2016. © 2011 Elsevier B.V. | en_US | 
| dc.contributor.department | Massachusetts Institute of Technology. Computer Science and Artificial Intelligence Laboratory | en_US | 
| dc.contributor.mitauthor | Santambrogio, Marco Domenico | en_US | 
| dc.relation.journal | Procedia Computer Science | en_US | 
| dc.eprint.version | Final published version | en_US | 
| dc.type.uri | http://purl.org/eprint/type/JournalArticle | en_US | 
| eprint.status | http://purl.org/eprint/status/PeerReviewed | en_US | 
| dspace.orderedauthors | Maggioni, Marco; Santambrogio, Marco Domenico; Liang, Jie | en_US | 
| mit.license | PUBLISHER_CC | en_US | 
| mit.metadata.status | Complete |  |