CMS distributed computing workflow experience
Author(s): Adelman-McCarthy, Jennifer; Gutsche, O.; Haas, Jeffrey D.; Prosper, Harrison B.; Dutta, Valentina; Gomez-Ceballos, Guillelmo; Hahn, Kristian Allan; Klute, Markus; Mohapatra, Ajit; Spinoso, Vincenzo; Kcira, Dorian; Caudron, Julien; Liao, Junhui; Pin, Arnaud; Schul, Nicolas; De Lentdecker, G.; McCartin, Joseph; Vanelderen, Lukas; Janssen, X.; Tsyganov, Andrey; Barge, D.; Lahiff, Andrew; ...
The vast majority of the CMS computing capacity, which is organized in a tiered hierarchy, is located away from CERN. The seven Tier-1 sites archive the LHC proton-proton collision data that is initially processed at CERN. These sites provide access to all recorded and simulated data for the Tier-2 sites via wide-area network (WAN) transfers. All central data processing workflows are executed at the Tier-1 level; these include re-reconstruction and skimming of collision data as well as reprocessing of simulated data to adapt to changing detector conditions. This paper describes the operation of the CMS processing infrastructure at the Tier-1 level. The Tier-1 workflows are described in detail, together with the operational optimization of resource usage. In particular, the variation of the different workflows during the 2010 data-taking period, their efficiencies and latencies, and their impact on the delivery of physics results are discussed, and lessons are drawn from this experience. The simulation of proton-proton collisions for the CMS experiment is primarily carried out at the second tier of the CMS computing infrastructure. Half of the CMS Tier-2 sites are reserved for central Monte Carlo (MC) production, while the other half is available for user analysis. This paper summarizes the large throughput of the MC production operation during the 2010 data-taking period and discusses the latencies and efficiencies of the various types of MC production workflows. We present the operational procedures used to optimize the usage of available resources, and we describe the CMS operational model for including opportunistic resources, such as the larger Tier-3 sites, in the central production operation.
Department: Massachusetts Institute of Technology. Laboratory for Nuclear Science
Journal: Journal of Physics: Conference Series
Publisher: Institute of Physics Publishing
Citation: Adelman-McCarthy, Jennifer et al. "CMS distributed computing workflow experience." In International Conference on Computing in High Energy and Nuclear Physics (CHEP 2010) IOP Publishing (Journal of Physics: Conference Series 331) (2011) 072019.
Version: Author's final manuscript