SC²EPTON : high-performance and scalable, low-power and intelligent, ordered Mesh on-chip network
Author(s)
Daya, Bhavya Kishor
Alternative title
High-performance and scalable, low-power and intelligent, ordered Mesh on-chip network
Other Contributors
Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science.
Abstract
Over the last few decades, hindrances to performance and voltage scaling led to a shift from uniprocessors to multicore processors, to the point where the on-chip interconnect plays a larger role in achieving the desired performance and power goals. Shared-memory multicores are subject to data-sharing concerns, as each processor computes on data locally and needs to be aware of accesses by other cores. Hardware cache coherence addresses the problem and provides superior performance to software-implemented coherence, but is limited by practical constraints, i.e., area, power, and timing. Scaling coherence to higher core counts presents challenges of unscalable storage, high power consumption, and increased on-chip network traffic. SC²EPTON targets the three challenges with three on-chip networks: SCORPIO, SCEPTER, and SB². SCORPIO addresses the unscalable storage plaguing directory-based coherence, with a 36-core chip prototype showcasing a novel distributed global ordering mechanism to support snoopy coherence over scalable mesh networks. Although the downsides of a directory are averted, the network itself consumes a significant fraction of the total chip power, of which the router buffer power dominates. SCEPTER is a bufferless mesh NoC that reduces the network power consumption and achieves high performance by intelligently prioritizing, routing, and throttling flits to maximize opportunities to bypass on dynamically set, virtual single-cycle express paths. For unicast communication, SCEPTER performs on par with state-of-the-art buffered networks; however, broadcasts exacerbate the link contention at bisection and ejection links, limiting performance gains. SB² addresses the broadcast traffic in bufferless NoCs with a TDM-based embedded ring architecture that dynamically determines ring access, allows multiple sources simultaneous contention-free access, and sets the control path locally at each node within the same cycle.
The three NoCs contribute key elements to the SC²EPTON architecture, resulting in a low-power and high-performance bufferless snoopy coherent mesh network.
Description
Thesis: Ph. D., Massachusetts Institute of Technology, Department of Electrical Engineering and Computer Science, 2015. Cataloged from PDF version of thesis. Includes bibliographical references (pages 156-162).
Date issued
2015
Department
Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science
Publisher
Massachusetts Institute of Technology
Keywords
Electrical Engineering and Computer Science.