
dc.contributor.author: Daya, Bhavya Kishor
dc.contributor.author: Chen, Chia-Hsin
dc.contributor.author: Subramanian, Suvinay
dc.contributor.author: Kwon, Woo Cheol
dc.contributor.author: Park, Sunghyun
dc.contributor.author: Krishna, Tushar
dc.contributor.author: Holt, Jim
dc.contributor.author: Chandrakasan, Anantha P.
dc.contributor.author: Peh, Li-Shiuan
dc.date.accessioned: 2014-06-30T16:37:34Z
dc.date.available: 2014-06-30T16:37:34Z
dc.date.issued: 2014-06-30
dc.identifier.issn: 1063-6897
dc.identifier.uri: http://hdl.handle.net/1721.1/88132
dc.description: URL to conference program [en_US]
dc.description.abstract: In the many-core era, scalable coherence and on-chip interconnects are crucial for shared memory processors. While snoopy coherence is common in small multicore systems, directory-based coherence is the de facto choice for scalability to many cores, as snoopy relies on ordered interconnects which do not scale. However, directory-based coherence does not scale beyond tens of cores due to excessive directory area overhead or inaccurate sharer tracking. Prior techniques supporting ordering on arbitrary unordered networks are impractical for full multicore chip designs. We present SCORPIO, an ordered mesh Network-on-Chip (NoC) architecture with a separate fixed-latency, bufferless network to achieve distributed global ordering. Message delivery is decoupled from the ordering, allowing messages to arrive in any order and at any time, and still be correctly ordered. The architecture is designed to plug-and-play with existing multicore IP and with practicality, timing, area, and power as top concerns. Full-system 36- and 64-core simulations on SPLASH-2 and PARSEC benchmarks show an average application run time reduction of 24.1% and 12.9%, in comparison to distributed directory and AMD HyperTransport coherence protocols, respectively. The SCORPIO architecture is incorporated in an 11 mm-by-13 mm chip prototype, fabricated in IBM 45 nm SOI technology, comprising 36 Freescale e200 Power Architecture™ cores with private L1 and L2 caches interfacing with the NoC via ARM AMBA, along with two Cadence on-chip DDR2 controllers. The chip prototype achieves a post-synthesis operating frequency of 1 GHz (833 MHz post-layout) with an estimated power of 28.8 W (768 mW per tile), while the network consumes only 10% of tile area and 19% of tile power. [en_US]
dc.description.sponsorship: United States. Defense Advanced Research Projects Agency (DARPA UHPC grant at MIT (Angstrom)) [en_US]
dc.description.sponsorship: Center for Future Architectures Research [en_US]
dc.description.sponsorship: Microelectronics Advanced Research Corporation (MARCO) [en_US]
dc.description.sponsorship: Semiconductor Research Corporation [en_US]
dc.language.iso: en_US
dc.publisher: Institute of Electrical and Electronics Engineers (IEEE) [en_US]
dc.relation.isversionof: http://cag.engr.uconn.edu/isca2014/program.html [en_US]
dc.rights: Creative Commons Attribution-Noncommercial-Share Alike [en_US]
dc.rights.uri: http://creativecommons.org/licenses/by-nc-sa/4.0/ [en_US]
dc.source: MIT web domain [en_US]
dc.title: SCORPIO: A 36-Core Research Chip Demonstrating Snoopy Coherence on a Scalable Mesh NoC with In-Network Ordering [en_US]
dc.type: Article [en_US]
dc.identifier.citation: Daya, Bhavya K., Chia-Hsin Owen Chen, Suvinay Subramanian, Woo-Cheol Kwon, Sunghyun Park, Tushar Krishna, Jim Holt, Anantha P. Chandrakasan, and Li-Shiuan Peh. "SCORPIO: A 36-Core Research Chip Demonstrating Snoopy Coherence on a Scalable Mesh NoC with In-Network Ordering." 41st International Symposium on Computer Architecture, ISCA 2014, Minneapolis, MN, USA, June 14-18, 2014. [en_US]
dc.contributor.department: Massachusetts Institute of Technology. Computer Science and Artificial Intelligence Laboratory [en_US]
dc.contributor.department: Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science [en_US]
dc.contributor.approver: Peh, Li-Shiuan [en_US]
dc.contributor.mitauthor: Daya, Bhavya Kishor [en_US]
dc.contributor.mitauthor: Chen, Chia-Hsin [en_US]
dc.contributor.mitauthor: Subramanian, Suvinay [en_US]
dc.contributor.mitauthor: Kwon, Woo Cheol [en_US]
dc.contributor.mitauthor: Park, Sunghyun [en_US]
dc.contributor.mitauthor: Krishna, Tushar [en_US]
dc.contributor.mitauthor: Holt, Jim [en_US]
dc.contributor.mitauthor: Chandrakasan, Anantha P. [en_US]
dc.contributor.mitauthor: Peh, Li-Shiuan [en_US]
dc.relation.journal: Proceedings of the 41st International Symposium on Computer Architecture, ISCA 2014 [en_US]
dc.eprint.version: Author's final manuscript [en_US]
dc.type.uri: http://purl.org/eprint/type/ConferencePaper [en_US]
eprint.status: http://purl.org/eprint/status/NonPeerReviewed [en_US]
dspace.orderedauthors: Daya, Bhavya K.; Chen, Chia-Hsin Owen; Subramanian, Suvinay; Kwon, Woo Cheol; Park, Sunghyun; Krishna, Tushar; Holt, Jim; Chandrakasan, Anantha P.; Peh, Li-Shiuan [en_US]
dc.identifier.orcid: https://orcid.org/0000-0002-2345-5791
dc.identifier.orcid: https://orcid.org/0000-0001-9010-6519
dc.identifier.orcid: https://orcid.org/0000-0002-3383-1535
dc.identifier.orcid: https://orcid.org/0000-0001-7701-8303
dc.identifier.orcid: https://orcid.org/0000-0002-5977-2748
dc.identifier.orcid: https://orcid.org/0000-0003-1284-6620
dspace.mitauthor.error: true
mit.license: OPEN_ACCESS_POLICY [en_US]
mit.metadata.status: Complete

