CMS distributed computing workflow experience

Adelman-McCarthy, Jennifer; Gutsche, Oliver; Haas, Jeffrey D; Prosper, Harrison B; Dutta, Valentina; Gomez-Ceballos, Guillelmo; Hahn, Kristian; Klute, Markus; Mohapatra, Ajit; Spinoso, Vincenzo; Kcira, Dorian; Caudron, Julien; Liao, Junhui; Pin, Arnaud; Schul, Nicolas; Lentdecker, Gilles De; McCartin, Joseph; Vanelderen, Lukas; Janssen, Xavier; Tsyganov, Andrey; Barge, Derek; Lahiff, Andrew

Author(s)

Adelman-McCarthy, Jennifer; Gutsche, O.; Haas, Jeffrey D.; Prosper, Harrison B.; Dutta, Valentina; ... Show more

DownloadCR2010_273 Klute.pdf (418.0Kb)

OPEN_ACCESS_POLICY

Terms of use

Creative Commons Attribution-Noncommercial-Share Alike 3.0 http://creativecommons.org/licenses/by-nc-sa/3.0/

Metadata

Show full item record

Abstract

The vast majority of the CMS Computing capacity, which is organized in a tiered hierarchy, is located away from CERN. The 7 Tier-1 sites archive the LHC proton-proton collision data that is initially processed at CERN. These sites provide access to all recorded and simulated data for the Tier-2 sites, via wide-area network (WAN) transfers. All central data processing workflows are executed at the Tier-1 level, which contain re-reconstruction and skimming workflows of collision data as well as reprocessing of simulated data to adapt to changing detector conditions. This paper describes the operation of the CMS processing infrastructure at the Tier-1 level. The Tier-1 workflows are described in detail. The operational optimization of resource usage is described. In particular, the variation of different workflows during the data taking period of 2010, their efficiencies and latencies as well as their impact on the delivery of physics results is discussed and lessons are drawn from this experience. The simulation of proton-proton collisions for the CMS experiment is primarily carried out at the second tier of the CMS computing infrastructure. Half of the Tier-2 sites of CMS are reserved for central Monte Carlo (MC) production while the other half is available for user analysis. This paper summarizes the large throughput of the MC production operation during the data taking period of 2010 and discusses the latencies and efficiencies of the various types of MC production workflows. We present the operational procedures to optimize the usage of available resources and we the operational model of CMS for including opportunistic resources, such as the larger Tier-3 sites, into the central production operation.

Date issued

2011

URI

http://hdl.handle.net/1721.1/78921

Department

Massachusetts Institute of Technology. Laboratory for Nuclear Science

Journal

Journal of Physics Conference Series

Publisher

Institute of Physics Publishing

Citation

Adelman-McCarthy, Jennifer et al. "CMS distributed computing workflow experience." In International Conference on Computing in High Energy and Nuclear Physics (CHEP 2010) IOP Publishing (Journal of Physics: Conference Series 331) (2011) 072019.

Version: Author's final manuscript

ISSN

1742-6588

1742-6596

Collections

MIT Open Access Articles

DSpace@MIT