Author(s): Yu, Xiangyao; Liu, Hongzhe; Zou, Ethan; Devadas, Srinivas
Cache coherence scalability is a big challenge in shared memory systems. Traditional protocols do not scale due to the storage and traffic overhead of cache invalidation. Tardis, a recently proposed coherence protocol, removes cache invalidation using logical timestamps and achieves excellent scalability. The original Tardis protocol, however, only supports the Sequential Consistency (SC) memory model, limiting its applicability. Tardis also incurs extra network traffic on some benchmarks due to renew messages, and has suboptimal performance when the program uses spinning to communicate between threads. In this paper, we address these downsides of the Tardis protocol and make it significantly more practical. Specifically, we discuss the architectural, memory system, and protocol changes required to implement the Total Store Order (TSO) consistency model on Tardis, and prove that the modified protocol satisfies TSO. We also describe modifications for Partial Store Order (PSO) and Release Consistency (RC). Finally, we propose optimizations for better leasing policies and for handling program spinning. On a set of benchmarks, optimized Tardis improves on a full-map directory protocol in performance, storage, and network traffic, while being simpler to implement.
Department: Massachusetts Institute of Technology. Computer Science and Artificial Intelligence Laboratory; Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science
Proceedings of the 2016 International Conference on Parallel Architectures and Compilation - PACT '16
Association for Computing Machinery (ACM)
Yu, Xiangyao, et al. "Tardis 2.0: Optimized Time Traveling Coherence for Relaxed Consistency Models." PACT '16 Proceedings of the 2016 International Conference on Parallel Architectures and Compilation, 11-15 September, 2016, Haifa, Israel, ACM Press, 2016, pp. 261–74.
Author's final manuscript