Using profiling to improve the performance of automatically parallelized programs
Author(s): Nutile, Domenic Jeffrey.
Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science.
Modern processors are reaching hundreds of cores, and the need for highly parallel programs is at an all-time high. Speculative execution is a promising approach to automatically parallelizing sequential programs, but even the best speculative parallelizing compilers and hardware architectures fail to unlock all available parallelism in every program. These limitations often stem from the structure of the original sequential program, which makes tasks unnecessarily dependent and limits speedups. This thesis presents TProf, a system that dynamically profiles automatically parallelized sequential programs to find parallelism bottlenecks. Our implementation of TProf targets the T4 compiler on the Swarm architecture. T4 leverages Swarm's fine-grain speculation and high scalability to enable novel compiler parallelization techniques. TProf analyzes the execution of the task structures that T4 generates to identify key points of parallel contention at the LLVM intermediate representation level. It then processes the collected information and relates performance bottlenecks back to the source code, which helps programmers quickly and easily decide which transformations to apply to unlock maximum parallelism.
Thesis: M. Eng., Massachusetts Institute of Technology, Department of Electrical Engineering and Computer Science, May, 2020. Cataloged from the official PDF of thesis. Includes bibliographical references (pages 37-39).
Department: Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science
Massachusetts Institute of Technology
Electrical Engineering and Computer Science.