| dc.contributor.author | Wanye, Frank | |
| dc.contributor.author | Gleyzer, Vitaliy | |
| dc.contributor.author | Kao, Edward | |
| dc.contributor.author | Feng, Wu-chun | |
| dc.date.accessioned | 2025-10-07T20:35:34Z | |
| dc.date.available | 2025-10-07T20:35:34Z | |
| dc.date.issued | 2025-07-20 | |
| dc.identifier.isbn | 979-8-4007-1869-4 | |
| dc.identifier.uri | https://hdl.handle.net/1721.1/163070 | |
| dc.description | HPDC ’25, Notre Dame, IN, USA | en_US |
| dc.description.abstract | Stochastic block partitioning (SBP) is a statistical inference-based
algorithm for clustering vertices within a graph. It has been shown
to be statistically robust and highly accurate even on graphs with
a complex structure, but its poor scalability limits its usability to
smaller-sized graphs. In this manuscript we argue that one reason
for its poor scalability is the agglomerative, or bottom-up, nature
of SBP’s algorithmic design; the agglomerative computations cause
high memory usage and create a large search space that slows
down statistical inference, particularly in the algorithm’s initial
iterations. To address this bottleneck, we propose Top-Down SBP, a
novel algorithm that replaces the agglomerative (bottom-up) block
merges in SBP with a block-splitting operation. This enables the
algorithm to start with all vertices in one cluster and subdivide
them over time into smaller clusters. We show that Top-Down
SBP is up to 7.7× faster than Bottom-Up SBP without sacrificing
accuracy and can process larger graphs than Bottom-Up SBP on
the same hardware due to an up to 4.1× decrease in memory usage.
Additionally, we adapt existing methods for accelerating Bottom-Up SBP to the Top-Down approach, leading to up to 13.2× speedup
over accelerated Bottom-Up SBP and up to 403× speedup over
sequential Bottom-Up SBP on 64 compute nodes. Thus, Top-Down
SBP represents substantial improvements to the scalability of SBP,
enabling the analysis of larger datasets on the same hardware. | en_US |
| dc.publisher | ACM|The 34th International Symposium on High-Performance Parallel and Distributed Computing | en_US |
| dc.relation.isversionof | https://doi.org/10.1145/3731545.3731589 | en_US |
| dc.rights | Creative Commons Attribution | en_US |
| dc.rights.uri | https://creativecommons.org/licenses/by/4.0/ | en_US |
| dc.source | Association for Computing Machinery | en_US |
| dc.title | Top-Down SBP: Turning Graph Clustering Upside Down | en_US |
| dc.type | Article | en_US |
| dc.identifier.citation | Wanye, Frank, Gleyzer, Vitaliy, Kao, Edward and Feng, Wu-chun. 2025. "Top-Down SBP: Turning Graph Clustering Upside Down." | |
| dc.contributor.department | Lincoln Laboratory | en_US |
| dc.identifier.mitlicense | PUBLISHER_POLICY | |
| dc.eprint.version | Final published version | en_US |
| dc.type.uri | http://purl.org/eprint/type/ConferencePaper | en_US |
| eprint.status | http://purl.org/eprint/status/NonPeerReviewed | en_US |
| dc.date.updated | 2025-10-01T07:47:07Z | |
| dc.language.rfc3066 | en | |
| dc.rights.holder | The author(s) | |
| dspace.date.submission | 2025-10-01T07:47:07Z | |
| mit.license | PUBLISHER_CC | |
| mit.metadata.status | Authority Work and Publication Information Needed | en_US |