A Case for Fine-Grain Adaptive Cache Coherence

Kurian, George; Khan, Omer; Devadas, Srinivas

Download: MIT-CSAIL-TR-2012-012.pdf (759.6Kb)

Other Contributors

Computation Structures

Advisor

Srini Devadas

Abstract

As transistor density continues to grow geometrically, processor manufacturers are already able to place a hundred cores on a chip (e.g., Tilera TILE-Gx 100), with massive multicore chips on the horizon. Programmers now need to invest more effort in designing software capable of exploiting multicore parallelism. The shared memory paradigm provides a convenient layer of abstraction to the programmer, but will current memory architectures scale to hundreds of cores? This paper directly addresses the question of how to enable scalable memory systems for future multicores. We develop a scalable, efficient shared memory architecture that enables seamless adaptation between private and logically shared caching at the fine granularity of cache lines. Our data-centric approach relies on in-hardware runtime profiling of the locality of each cache line and only allows private caching for data blocks with high spatio-temporal locality. This allows us to better exploit on-chip cache capacity and enable low-latency memory access in large-scale multicores.
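The per-line adaptation described in the abstract can be illustrated with a small software model. This is only a sketch under stated assumptions, not the report's hardware design: the abstract does not say how locality is measured, so the reuse counter, the `threshold` value, the "remote"/"private" mode names, and the reset-on-eviction policy are all hypothetical choices made for illustration.

```python
from dataclasses import dataclass, field

# Assumed value for illustration; the abstract does not give a threshold.
PRIVATE_CACHING_THRESHOLD = 4


@dataclass
class LineProfile:
    """Hypothetical per-cache-line profiling state."""
    reuse_count: int = 0
    mode: str = "remote"  # "remote": access at the shared location; "private": cache locally


@dataclass
class LocalityClassifier:
    """Toy model: a line earns private caching only after enough observed reuse."""
    threshold: int = PRIVATE_CACHING_THRESHOLD
    profiles: dict = field(default_factory=dict)

    def access(self, line_addr: int) -> str:
        """Record one access to a line and return its mode after this access."""
        profile = self.profiles.setdefault(line_addr, LineProfile())
        profile.reuse_count += 1
        if profile.reuse_count >= self.threshold:
            profile.mode = "private"
        return profile.mode

    def evict(self, line_addr: int) -> None:
        """Assumed policy: forget the profile so the line must show locality again."""
        self.profiles.pop(line_addr, None)


classifier = LocalityClassifier(threshold=3)
hot_line_modes = [classifier.access(0x40) for _ in range(3)]  # reused line
cold_line_mode = classifier.access(0x80)                      # touched once
```

In this model a line that is touched only once stays in the logically shared mode and never occupies private cache capacity, while a repeatedly reused line is promoted to private caching once its count reaches the assumed threshold.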

Date issued

2012-05-22

URI

http://hdl.handle.net/1721.1/70909

Series/Report no.

MIT-CSAIL-TR-2012-012

Collections

CSAIL Technical Reports (July 1, 2003 - present)