Software orchestration of instruction level parallelism on tiled processor architectures
Author(s)Lee, Walter (Walter Cheng-Wan)
Software orchestration of ILP on TPAs
Massachusetts Institute of Technology. Dept. of Electrical Engineering and Computer Science.
Anant Agarwal and Saman Amarasinghe.
MetadataShow full item record
Projection from silicon technology is that while transistor budget will continue to blossom according to Moore's law, latency from global wires will severely limit the ability to scale centralized structures at high frequencies. A tiled processor architecture (TPA) eliminates long wires from its design by distributing its resources over a pipelined interconnect. By exposing the spatial distribution of these resources to the compiler, a TPA allows the compiler to optimize for locality, thus minimizing the distance that data needs to travel to reach the consuming computation. This thesis examines the compiler problem of exploiting instruction level parallelism (ILP) on a TPA. It describes Rawcc, an ILP compiler for Raw, a fully distributed TPA. The thesis examines the implication of the resource distribution on the exploitation of ILP for each of the following resources: instructions, registers, control, data memory, and wires. It designs novel solutions for each one, and it describes the solutions within the integrated framework of a working compiler. Performance is evaluated on a cycle-accurate Raw simulator as well as on a 16-tile Raw chip. Results show that Rawcc can attain modest speedups for fine-grained applications, as well speedups that scale up to 64 tiles for applications with such parallelism.
Thesis (Ph. D.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 2005.Includes bibliographical references (p. 135-138).
DepartmentMassachusetts Institute of Technology. Dept. of Electrical Engineering and Computer Science.
Massachusetts Institute of Technology
Electrical Engineering and Computer Science.