dc.contributor.advisor: Li-Shiuan Peh. [en_US]
dc.contributor.author: Krishna, Tushar [en_US]
dc.contributor.other: Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science. [en_US]
dc.date.accessioned: 2014-06-13T22:33:03Z
dc.date.available: 2014-06-13T22:33:03Z
dc.date.copyright: 2014 [en_US]
dc.date.issued: 2014 [en_US]
dc.identifier.uri: http://hdl.handle.net/1721.1/87926
dc.description: Thesis: Ph. D., Massachusetts Institute of Technology, Department of Electrical Engineering and Computer Science, 2014. [en_US]
dc.description: Cataloged from PDF version of thesis. [en_US]
dc.description: Includes bibliographical references (pages 197-204). [en_US]
dc.description.abstract: Adding multiple processing cores on the same chip has become the de facto design choice as we continue extracting more and more performance/watt from our chips in every technology generation. In this context, the interconnect fabric connecting the cores starts gaining paramount importance. A high latency network can create performance bottlenecks and limit scalability. Thus conventional wisdom forces coherence protocol and software designers to develop techniques to optimize for locality and keep communication to the minimum. This dissertation challenges this conventional wisdom. We show that on-chip networks can be designed to provide extremely low-latencies while handling bursts of high-bandwidth traffic, thus reversing the trade-offs one typically associates with Private vs. Shared caches, or Broadcast vs. Directory protocols. The dissertation progressively builds a network-on-chip fabric that dynamically creates single-cycle network paths across multiple-hops, for both unicast and collective (1-to-Many and Many-to-1) communication flows. We start with a prototype chip demonstrating single-cycle per-hop traversals over a mesh network-on-chip. This design is then enhanced to support 1-to-Many (multicast) and Many-to-1 (acknowledgement) traffic flows by intelligent forking and aggregation respectively at network routers. Finally, we leverage clock-less repeated wires on the data-path and propose a dynamic cycle-by-cycle network reconfiguration methodology to provide single-cycle traversals across 9-11 hops at GHz frequencies. The network architectures proposed in this thesis provide performance that is within 12% of that provided by an idealized contention-free fully-connected single-cycle network. Going forward, we believe that the ideas proposed in this thesis can pave the way for locality-oblivious shared-memory design. [en_US]
dc.description.statementofresponsibility: by Tushar Krishna. [en_US]
dc.format.extent: 204 pages [en_US]
dc.language.iso: eng [en_US]
dc.publisher: Massachusetts Institute of Technology [en_US]
dc.rights: M.I.T. theses are protected by copyright. They may be viewed from this source for any purpose, but reproduction or distribution in any format is prohibited without written permission. See provided URL for inquiries about permission. [en_US]
dc.rights.uri: http://dspace.mit.edu/handle/1721.1/7582 [en_US]
dc.subject: Electrical Engineering and Computer Science. [en_US]
dc.title: Enabling dedicated single-cycle connections over a shared network-on-chip [en_US]
dc.type: Thesis [en_US]
dc.description.degree: Ph. D. [en_US]
dc.contributor.department: Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science
dc.identifier.oclc: 880140404 [en_US]

