Show simple item record

dc.contributor.advisor | Srinivas Devadas. | en_US
dc.contributor.author | Lis, Mieszko N. (Mieszko Norbert), 1977- | en_US
dc.contributor.other | Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science. | en_US
dc.date.accessioned | 2015-01-20T17:59:31Z
dc.date.available | 2015-01-20T17:59:31Z
dc.date.copyright | 2014 | en_US
dc.date.issued | 2014 | en_US
dc.identifier.uri | http://hdl.handle.net/1721.1/93066
dc.description | Thesis: Ph. D., Massachusetts Institute of Technology, Department of Electrical Engineering and Computer Science, 2014. | en_US
dc.description | Cataloged from PDF version of thesis. | en_US
dc.description | Includes bibliographical references (pages 109-113). | en_US
dc.description.abstract | Although thread migration has long been employed to satisfy load-balancing goals in operating systems for symmetric multiprocessing hardware, the high cost of OS-mediated migration has made more fine-grained applications impractical. With only a few cores per processor, and high overheads due to moving threads across processors and loss of cache affinity, assigning threads to specific processor cores for long periods has remained the default strategy for ensuring maximum performance. Massive-scale single-chip multiprocessors dramatically alter this picture. On-chip data transfer latencies, even across a 100+-core chip, rarely exceed tens of cycles, making the potential cost of thread migration as low as executing several instructions. At the same time, all cores are placed on the same die and typically share one last-level cache distributed on chip, obviating cache affinity concerns. In this dissertation, we explore the limits of fine-grained thread migration by developing an autonomous mechanism for migrating threads implemented entirely in hardware. We then employ migration to implement the unified shared memory abstraction without a cache coherence protocol (a particularly demanding application that requires fast and fine-grained thread movement) and show that performance is competitive with traditional shared memory mechanisms. Finally, we describe a real-world implementation of both concepts in a 110-core single-chip multiprocessor in 45 nm ASIC technology. | en_US
dc.description.statementofresponsibility | by Mieszko Lis. | en_US
dc.format.extent | 113 pages | en_US
dc.language.iso | eng | en_US
dc.publisher | Massachusetts Institute of Technology | en_US
dc.rights | M.I.T. theses are protected by copyright. They may be viewed from this source for any purpose, but reproduction or distribution in any format is prohibited without written permission. See provided URL for inquiries about permission. | en_US
dc.rights.uri | http://dspace.mit.edu/handle/1721.1/7582 | en_US
dc.subject | Electrical Engineering and Computer Science. | en_US
dc.title | Hardware-level fine-grained thread migration | en_US
dc.type | Thesis | en_US
dc.description.degree | Ph. D. | en_US
dc.contributor.department | Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science
dc.identifier.oclc | 900002358 | en_US


