Simple, Fast, Scalable, and Reliable Multiprocessor Algorithms
Author(s)
Jayanti, Siddhartha Visveswara
Advisor
Shun, Julian
Abstract
In this thesis, I identify simplicity, speed, scalability, and reliability as four core design goals for multiprocessor algorithms, and design and analyze algorithms that meet these goals.
I design the first scalable algorithm for concurrent union-find. Our algorithm provides almost-linear speed-up, performing just [formula] work when p processes execute a total of m operations on an instance with n nodes. I furnish the algorithm with a rigorous, machine-verified proof of correctness, and prove that its work-complexity is optimal amongst a class of symmetric algorithms, which captures the complexities of all known concurrent union-find algorithms. The algorithm is lightning quick in practice: it has improved the state-of-the-art in model checking [Bloemen] and spatial clustering [Wang et al.], and is the fastest algorithm for computing connected components on both CPUs and GPUs [Dhulipala et al., Hong et al.].
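For readers unfamiliar with the union-find interface mentioned above, the following is a minimal sequential sketch in Python. It is not the concurrent algorithm from the thesis. It only illustrates the operations (find, unite, same-set query) and the connected-components use case, using the standard path-halving and link-by-index rules. The names `UnionFind`, `unite`, `same_set`, and `count_components` are hypothetical and chosen for illustration.

```python
class UnionFind:
    """Sequential union-find over nodes 0..n-1 (illustrative sketch only)."""

    def __init__(self, n):
        # Each node starts as the root of its own singleton set.
        self.parent = list(range(n))

    def find(self, x):
        # Path halving: point each visited node at its grandparent.
        while self.parent[x] != x:
            self.parent[x] = self.parent[self.parent[x]]
            x = self.parent[x]
        return x

    def same_set(self, x, y):
        return self.find(x) == self.find(y)

    def unite(self, x, y):
        # Returns True if two distinct sets were merged, False otherwise.
        rx, ry = self.find(x), self.find(y)
        if rx == ry:
            return False
        # Link by index: the smaller-numbered root goes under the larger one.
        if rx < ry:
            self.parent[rx] = ry
        else:
            self.parent[ry] = rx
        return True


def count_components(n, edges):
    """Count connected components of an undirected graph on n nodes."""
    uf = UnionFind(n)
    components = n
    for u, v in edges:
        if uf.unite(u, v):
            components -= 1
    return components
```

A concurrent version would have to make the pointer updates in `find` and `unite` safe under simultaneous access by many processes, which is the problem the thesis addresses.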
I introduce concurrent fast arrays, which are linearizable wait-free arrays that support all operations, including initialization, in just constant time. As an application, I design the first fixed-length fast hash table, which supports constant time initialization, insertions, and queries.
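As background for the idea of constant-time initialization, here is a minimal sketch of the classic sequential technique that uses a stack of written indices. It is not the concurrent wait-free construction from the thesis. Python cannot allocate uninitialized memory, so allocation here still takes linear time. The sketch fills the arrays with arbitrary values to stand in for garbage, and shows that reads, writes, and a full reset each take constant time without trusting that garbage. The class name `InitializableArray` and its methods are hypothetical.

```python
import random


class InitializableArray:
    """Sequential array with O(1) read, write, and reset (illustrative sketch)."""

    def __init__(self, n, default=0):
        # Stand-ins for uninitialized memory: contents are arbitrary garbage.
        self.values = [random.randrange(n) for _ in range(n)]
        self.where = [random.randrange(n) for _ in range(n)]
        self.stack = [random.randrange(n) for _ in range(n)]
        self.count = 0          # number of distinct indices written so far
        self.default = default  # value of every index not yet written

    def _is_written(self, i):
        # Index i is valid only if its certificate points into the live
        # part of the stack and that stack slot points back at i.
        k = self.where[i]
        return k < self.count and self.stack[k] == i

    def read(self, i):
        return self.values[i] if self._is_written(i) else self.default

    def write(self, i, v):
        if not self._is_written(i):
            # Issue a fresh certificate for index i.
            self.where[i] = self.count
            self.stack[self.count] = i
            self.count += 1
        self.values[i] = v

    def reset(self, default=0):
        # O(1) re-initialization: invalidate every certificate at once.
        self.count = 0
        self.default = default
```

For example, after `a = InitializableArray(8, default=-1)` every index reads as the default; after `a.write(3, 42)` only index 3 changes; and `a.reset(7)` makes every index read as 7 again in a single step.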
I define సామాన్య జాగృతి (generalized wake-up), which generalizes the information propagation problem called wake-up. I prove fundamental hardness results about this problem, and through reductions, show that any linearizable queue, stack, priority queue, counter, or union-find object's work complexity must increase with process count; these lower bounds are robust to both randomization and amortization. This thesis includes the original results in Telugu with Sanskrit abstract, along with their English translation.
I design optimal complexity locks for real-time and persistent memory systems. Our abortable queue lock is the first abortable lock to achieve O(1) amortized RMR complexity for both cache-coherent (CC) and distributed shared memory (DSM) systems. It additionally provides "abortable first-come-first-served" fairness and supports "fast aborts". Our recoverable queue lock is the first recoverable lock to achieve the optimal O(log p / log log p) worst-case RMR complexity on both CC and DSM persistent memory systems. Both locks are innovations on our newly devised standard lock, whose design simplifies and unifies several previously known techniques.
This thesis also emphasizes rigorous guarantees for concurrent algorithms. I devise a novel universal, sound, and complete "tracking" technique for proving linearizable and strong linearizable correctness of concurrent algorithms. My collaborators and I have used this technique to give machine-verified proofs of correctness for multicore queue, union-find, and snapshot algorithms.
Finally, I prove and experimentally validate that asynchronous "HOGWILD!" Gibbs Sampling, a technique born from machine learning practice, can be used to accurately estimate expectations of polynomial and other statistics of graphical models satisfying Dobrushin's condition.
Date issued
2023-02
Department
Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science
Publisher
Massachusetts Institute of Technology