
dc.contributor.author: Wang, Yiqiu
dc.contributor.author: Yu, Shangdi
dc.contributor.author: Gu, Yan
dc.contributor.author: Shun, Julian
dc.date.accessioned: 2022-10-24T17:13:00Z
dc.date.available: 2022-07-20T15:09:25Z
dc.date.available: 2022-10-24T17:13:00Z
dc.date.issued: 2021
dc.identifier.uri: https://hdl.handle.net/1721.1/143884.2
dc.description.abstract: This paper presents new parallel algorithms for generating Euclidean minimum spanning trees and spatial clustering hierarchies (known as HDBSCAN$^*$). Our approach is based on generating a well-separated pair decomposition followed by using Kruskal's minimum spanning tree algorithm and bichromatic closest pair computations. We introduce a new notion of well-separation to reduce the work and space of our algorithm for HDBSCAN$^*$. We also present a parallel approximate algorithm for OPTICS based on a recent sequential algorithm by Gan and Tao. Finally, we give a new parallel divide-and-conquer algorithm for computing the dendrogram and reachability plots, which are used in visualizing clusters of different scale that arise for both EMST and HDBSCAN$^*$. We show that our algorithms are theoretically efficient: they have work (number of operations) matching their sequential counterparts, and polylogarithmic depth (parallel time). We implement our algorithms and propose a memory optimization that requires only a subset of well-separated pairs to be computed and materialized, leading to savings in both space (up to 10x) and time (up to 8x). Our experiments on large real-world and synthetic data sets using a 48-core machine show that our fastest algorithms outperform the best serial algorithms for the problems by 11.13--55.89x, and existing parallel algorithms by at least an order of magnitude. [en_US]
dc.language.iso: en
dc.publisher: Association for Computing Machinery (ACM) [en_US]
dc.relation.isversionof: 10.1145/3448016.3457296 [en_US]
dc.rights: Article is made available in accordance with the publisher's policy and may be subject to US copyright law. Please refer to the publisher's site for terms of use. [en_US]
dc.source: ACM [en_US]
dc.title: Fast Parallel Algorithms for Euclidean Minimum Spanning Tree and Hierarchical Spatial Clustering [en_US]
dc.type: Article [en_US]
dc.identifier.citation: Wang, Yiqiu, Yu, Shangdi, Gu, Yan and Shun, Julian. 2021. "Fast Parallel Algorithms for Euclidean Minimum Spanning Tree and Hierarchical Spatial Clustering." Proceedings of the 2021 International Conference on Management of Data. [en_US]
dc.contributor.department: Massachusetts Institute of Technology. Computer Science and Artificial Intelligence Laboratory
dc.relation.journal: Proceedings of the 2021 International Conference on Management of Data [en_US]
dc.eprint.version: Final published version [en_US]
dc.type.uri: http://purl.org/eprint/type/ConferencePaper [en_US]
eprint.status: http://purl.org/eprint/status/NonPeerReviewed [en_US]
dc.date.updated: 2022-07-20T15:01:02Z
dspace.orderedauthors: Wang, Y; Yu, S; Gu, Y; Shun, J [en_US]
dspace.date.submission: 2022-07-20T15:01:03Z
mit.license: PUBLISHER_POLICY
mit.metadata.status: Authority Work and Publication Information Needed [en_US]
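
Illustrative note on the abstract: the abstract builds the Euclidean minimum spanning tree (EMST) with Kruskal's algorithm over candidate edges obtained from a well-separated pair decomposition and bichromatic closest pair computations. The sketch below is not the paper's algorithm. It is a sequential brute-force baseline that runs Kruskal over all point pairs, meant only to show what an EMST is. The function name `euclidean_mst` and the sample points are hypothetical and do not come from the record.

```python
import math
from itertools import combinations


def euclidean_mst(points):
    """Return (total_weight, edges) of a Euclidean MST via brute-force Kruskal.

    Assumption: every pair of points is a candidate edge (O(n^2) edges).
    The paper described in the abstract avoids this by taking candidates
    from a well-separated pair decomposition instead.
    """
    parent = list(range(len(points)))

    def find(x):
        # Union-find lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    # Sort all pairwise edges by Euclidean length.
    candidates = sorted(
        (math.dist(points[i], points[j]), i, j)
        for i, j in combinations(range(len(points)), 2)
    )

    total, edges = 0.0, []
    for weight, i, j in candidates:
        root_i, root_j = find(i), find(j)
        if root_i != root_j:
            # The edge joins two components, so Kruskal keeps it.
            parent[root_i] = root_j
            edges.append((i, j))
            total += weight
    return total, edges


if __name__ == "__main__":
    sample = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (5.0, 5.0)]
    print(euclidean_mst(sample))
```

The quadratic number of candidate edges is what makes this baseline impractical for large inputs, which is the cost the decomposition-based approach in the abstract is designed to avoid.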

