Brief announcement: Distributed shared memory based on computation migration
Author(s)Lis, Mieszko; Shim, Keun Sup; Cho, Myong Hyon; Fletcher, Christopher Wardlaw; Kinsy, Michel A.; Lebedev, Ilia A.; Khan, Omer; Devadas, Srinivas; ... Show more Show less
MetadataShow full item record
Driven by increasingly unbalanced technology scaling and power dissipation limits, microprocessor designers have resorted to increasing the number of cores on a single chip, and pundits expect 1000-core designs to materialize in the next few years . But how will memory architectures scale and how will these next-generation multicores be programmed? One barrier to scaling current memory architectures is the offchip memory bandwidth wall [1,2]: off-chip bandwidth grows with package pin density, which scales much more slowly than on-die transistor density . To reduce reliance on external memories and keep data on-chip, today’s multicores integrate very large shared last-level caches on chip ; interconnects used with such shared caches, however, do not scale beyond relatively few cores, and the power requirements and access latencies of large caches exclude their use in chips on a 1000-core scale. For massive-scale multicores, then, we are left with relatively small per-core caches. Per-core caches on a 1000-core scale, in turn, raise the question of memory coherence. On the one hand, a shared memory abstraction is a practical necessity for general-purpose programming, and most programmers prefer a shared memory model . On the other hand, ensuring coherence among private caches is an expensive proposition: bus-based and snoopy protocols don’t scale beyond relatively few cores, and directory sizes needed in cache-coherence protocols must equal a significant portion of the combined size of the per-core caches as otherwise directory evictions will limit performance . Moreover, directory-based coherence protocols are notoriously difficult to implement and verify .
DepartmentMassachusetts Institute of Technology. Computer Science and Artificial Intelligence Laboratory; Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science
Proceedings of the 23rd ACM Symposium on Parallelism in Algorithms and Architectures (SPAA '11)
Association for Computing Machinery (ACM)
Mieszko Lis, Keun Sup Shim, Myong Hyon Cho, Christopher W. Fletcher, Michel Kinsy, Ilia Lebedev, Omer Khan, and Srinivas Devadas. 2011. Brief announcement: distributed shared memory based on computation migration. In Proceedings of the 23rd ACM symposium on Parallelism in algorithms and architectures (SPAA '11). ACM, New York, NY, USA, 253-256.
Author's final manuscript