
dc.contributor.advisor: Sanchez, Daniel
dc.contributor.author: Lee, Hyun Ryong
dc.date.accessioned: 2022-08-29T16:10:21Z
dc.date.available: 2022-08-29T16:10:21Z
dc.date.issued: 2022-05
dc.date.submitted: 2022-06-21T19:25:45.947Z
dc.identifier.uri: https://hdl.handle.net/1721.1/144767
dc.description.abstract: Benchmarks that closely match the behavior of production workloads are crucial for designing and provisioning computer systems. However, current approaches fall short. First, open-source benchmarks use public datasets that cause behavior different from that of production workloads. Second, black-box workload cloning techniques generate synthetic code that imitates the target workload, but the resulting program fails to capture most workload characteristics, such as microarchitectural bottlenecks or time-varying behavior. Generating code that mimics a complex application is an extremely hard problem. Instead, this thesis proposes a different and easier approach to benchmark synthesis. The key insight is that for many production workloads, the program is publicly available, or a reasonably similar open-source program exists. In this case, generating the right dataset is sufficient to produce an accurate benchmark. Based on this observation, this thesis presents Datamime, a profile-guided approach to generating representative benchmarks for production workloads. Datamime uses the performance profiles of a target workload to generate a dataset that, when used by a benchmark program, behaves very similarly to the target workload in terms of its microarchitectural characteristics. We evaluate Datamime on several datacenter workloads. Datamime generates synthetic benchmarks that closely match the microarchitectural features of these workloads, with a mean absolute percentage error of 4% on IPC (instructions per cycle). This close match in microarchitectural behavior holds across processor types. Finally, time-varying behaviors are also replicated, making these benchmarks useful, for example, to characterize and optimize tail latency.
dc.publisher: Massachusetts Institute of Technology
dc.rights: In Copyright - Educational Use Permitted
dc.rights: Copyright MIT
dc.rights.uri: http://rightsstatements.org/page/InC-EDU/1.0/
dc.title: Generating Representative Benchmarks by Automatically Synthesizing Datasets
dc.type: Thesis
dc.description.degree: S.M.
dc.contributor.department: Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science
dc.identifier.orcid: 0000-0002-8627-2781
mit.thesis.degree: Master
thesis.degree.name: Master of Science in Electrical Engineering and Computer Science


