Modeling exascale data generation and storage for the large hadron collider computing network
Author(s): Massaro, Evan K.
Massachusetts Institute of Technology. Computation for Design and Optimization Program.
Markus Klute and John Williams.
The Large Hadron Collider (LHC) is the world's largest and highest-energy particle accelerator. With the particle collisions produced at the LHC and measured with the Compact Muon Solenoid (CMS) detector, the CMS experimental group performs precision measurements and general searches for new physics. Year-round CMS operations produce 100 Petabytes of physics data per year, which is stored within a globally distributed grid network of 70 scientific institutions. By 2027, upgrades to the LHC and CMS detector will allow unprecedented probes of microscopic physics, but in doing so will generate 2,000 Petabytes (2 Exabytes) of physics data per year. To address the computational requirements of CMS, the costs of CPU resources, disk and tape storage, and tape drives were modeled. These resources were then used in a model of the major CMS computing processes and required infrastructure. In addition to estimating budget requirements, this model produced bandwidth requirements, for which the transatlantic network cable was explicitly addressed. Given discrete or continuously parameterized policy decisions, the system cost and required network bandwidth could be modeled as a function of the policy. This sensitivity analysis was coupled to an uncertainty quantification of the model outputs, which were functions of the estimated system parameters. The expected value of the system cost and maximum transatlantic network activity were modeled to increase 40 times in 2027 relative to 2018. In 2027 the required transatlantic network capacity was modeled to have an expected value of 210 Gbps, with a 95% confidence interval that reaches 330 Gbps, just under the current bandwidth of 340 Gbps. By changing specific computing policies, the system cost and network load were shown to decrease. Specific policies can reduce the network load to an expected value of 150 Gbps, with a 95% confidence interval that reaches 260 Gbps. Given the unprecedented volume of data, such policy changes can allow CMS to meet its future physics goals.
Thesis: S.M., Massachusetts Institute of Technology, Computation for Design and Optimization Program, May, 2020. Cataloged from the official PDF of thesis. Includes bibliographical references (pages 85-86).
Department: Massachusetts Institute of Technology. Computation for Design and Optimization Program
Massachusetts Institute of Technology
Computation for Design and Optimization Program.