dc.contributor.author | Beckmann, Nathan Zachary | |
dc.contributor.author | Tsai, Po-An | |
dc.contributor.author | Sanchez, Daniel | |
dc.date.accessioned | 2015-02-26T13:37:58Z | |
dc.date.available | 2015-02-26T13:37:58Z | |
dc.date.issued | 2015-02 | |
dc.identifier.uri | http://hdl.handle.net/1721.1/95648 | |
dc.description.abstract | Cache hierarchies are increasingly non-uniform, so for systems to scale efficiently, data must be close to the threads that use it. Moreover, cache capacity is limited and contended among threads, introducing complex capacity/latency tradeoffs. Prior NUCA schemes have focused on managing data to reduce access latency, but have ignored thread placement; and applying prior NUMA thread placement schemes to NUCA is inefficient, as capacity, not bandwidth, is the main constraint. We present CDCS, a technique to jointly place threads and data in multicores with distributed shared caches. We develop novel monitoring hardware that enables fine-grained space allocation on large caches, and data movement support to allow frequent full-chip reconfigurations. On a 64-core system, CDCS outperforms an S-NUCA LLC by 46% on average (up to 76%) in weighted speedup and saves 36% of system energy. CDCS also outperforms state-of-the-art NUCA schemes under different thread scheduling policies. | en_US |
dc.description.sponsorship | National Science Foundation (U.S.) (Grant CCF-1318384) | en_US |
dc.description.sponsorship | Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science (Jacobs Presidential Fellowship) | en_US |
dc.description.sponsorship | United States. Defense Advanced Research Projects Agency (PERFECT Contract HR0011-13-2-0005) | en_US |
dc.language.iso | en_US | |
dc.publisher | Institute of Electrical and Electronics Engineers (IEEE) | en_US |
dc.relation.isversionof | http://darksilicon.org/hpca/?page_id=53 | en_US |
dc.rights | Creative Commons Attribution-Noncommercial-Share Alike | en_US |
dc.rights.uri | http://creativecommons.org/licenses/by-nc-sa/4.0/ | en_US |
dc.source | MIT web domain | en_US |
dc.title | Scaling Distributed Cache Hierarchies through Computation and Data Co-Scheduling | en_US |
dc.type | Article | en_US |
dc.identifier.citation | Beckmann, Nathan, Po-An Tsai, and Daniel Sanchez. "Scaling Distributed Cache Hierarchies through Computation and Data Co-Scheduling." 21st IEEE Symposium on High Performance Computer Architecture (February 2015). | en_US |
dc.contributor.department | Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science | en_US |
dc.contributor.mitauthor | Beckmann, Nathan Zachary | en_US |
dc.contributor.mitauthor | Tsai, Po-An | en_US |
dc.contributor.mitauthor | Sanchez, Daniel | en_US |
dc.relation.journal | Proceedings of the 21st IEEE Symposium on High Performance Computer Architecture | en_US |
dc.eprint.version | Author's final manuscript | en_US |
dc.type.uri | http://purl.org/eprint/type/ConferencePaper | en_US |
eprint.status | http://purl.org/eprint/status/NonPeerReviewed | en_US |
dspace.orderedauthors | Beckmann, Nathan; Tsai, Po-An; Sanchez, Daniel | en_US |
dc.identifier.orcid | https://orcid.org/0000-0002-2453-2904 | |
dc.identifier.orcid | https://orcid.org/0000-0002-6057-9769 | |
dc.identifier.orcid | https://orcid.org/0000-0003-4561-6450 | |
mit.license | OPEN_ACCESS_POLICY | en_US |
mit.metadata.status | Complete | |