Parallel and Distributed Just-in-Time Shell Script Compilation
Author(s)
Mustafa, Tammam
Advisor
Vasilakis, Nikos
Rinard, Martin C.
Abstract
In the past several years, the shell has received renewed interest from the research community. This thesis describes my work to advance the performance and capabilities of state-of-the-art shell-script parallelization systems. In the first half of this thesis, I focus on my contributions to PaSh-JIT, a JIT compiler for parallelizing POSIX shell scripts. In the second half, I explore the design and implementation of Distributed-PaSh, a shell that can utilize distributed computing resources and easily interface with distributed storage systems to efficiently execute data-processing pipelines. Distributed-PaSh analyzes the dataflow graph of a given script to create highly parallel data pipelines and executes those pipelines on a distributed cluster while giving special attention to data locality and movement. Distributed-PaSh achieves higher performance than single-machine sequential and parallel shells.
Date issued
2022-05
Department
Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science
Publisher
Massachusetts Institute of Technology