
dc.contributor.author: Chaze, Olivier
dc.contributor.author: André, Jean-Marc
dc.contributor.author: Andronidis, Anastasios
dc.contributor.author: Behrens, Ulf
dc.contributor.author: Branson, James
dc.contributor.author: Brummer, Philipp
dc.contributor.author: Contescu, Cristian
dc.contributor.author: Cittolin, Sergio
dc.contributor.author: Craigs, Benjamin
dc.contributor.author: Darlea, Georgiana-Lavinia
dc.contributor.author: Deldicque, Christian
dc.contributor.author: Demiragli, Zeynep
dc.contributor.author: Dobson, M.
dc.contributor.author: Doualot, Nicolas
dc.contributor.author: Erhan, Samim
dc.contributor.author: Fulcher, Jonathan Richard
dc.contributor.author: Gigi, Dominique
dc.contributor.author: Glege, Frank
dc.contributor.author: Gomez-Ceballos, Guillelmo
dc.contributor.author: Hegeman, Jeroen
dc.contributor.author: Holzner, Andre Georg
dc.contributor.author: Jimenez-Estupiñán, Raul
dc.contributor.author: Masetti, Lorenzo
dc.contributor.author: Meijers, Frans
dc.contributor.author: Meschi, Emilio
dc.contributor.author: Mommsen, Remigius
dc.contributor.author: Morovic, Srecko
dc.contributor.author: O'Dell, Vivian
dc.contributor.author: Orsini, Luciano
dc.contributor.author: Paus, Christoph M. E.
dc.contributor.author: Pieri, Marco
dc.contributor.author: Racz, Attila
dc.contributor.author: Sakulin, Hannes
dc.contributor.author: Schwick, Christoph
dc.contributor.author: Reis, Thomas
dc.contributor.author: Simelevicius, Dainius
dc.contributor.author: Zejdl, Petr
dc.date.accessioned: 2019-07-15T20:16:36Z
dc.date.available: 2019-07-15T20:16:36Z
dc.date.issued: 2017-01
dc.identifier.uri: https://hdl.handle.net/1721.1/121615
dc.description.abstract: After two years of maintenance and upgrades, the Large Hadron Collider (LHC), the largest and most powerful particle accelerator in the world, has started its second three-year run. Around 1500 computers make up the CMS (Compact Muon Solenoid) Online cluster. This cluster is used for data acquisition of the CMS experiment at CERN, selecting and sending to storage around 20 TBytes of data per day, which are then analysed by the Worldwide LHC Computing Grid (WLCG) infrastructure that links hundreds of data centres worldwide. 3000 CMS physicists can access and process the data, and they are always seeking more computing power and data. The backbone of the CMS Online cluster is composed of 16000 cores, which provide as much computing power as all CMS WLCG Tier1 sites combined (a HEP-SPEC-06 score of 352K in the CMS cluster versus 300K across CMS Tier1 sites). The computing power available in the CMS cluster can significantly speed up the processing of data, so an effort has been made to allocate the resources of the CMS Online cluster to the grid when they are not used to full capacity for data acquisition. This occurs during the maintenance periods when the LHC is non-operational, which amounted to 117 days in 2015. During 2016, the aim is to increase the availability of the CMS Online cluster for data processing by making the cluster accessible during the time between two periods of physics collisions, while the LHC and beams are being prepared. This is usually the case for a few hours every day, which would vastly increase the computing power available for data processing. Work has already been undertaken to provide this functionality: an OpenStack cloud layer has been deployed as a minimal overlay that leaves the primary role of the cluster untouched. This overlay also abstracts the different hardware and networks that the cluster is composed of.
The operation of the cloud (starting and stopping the virtual machines) is another challenge that has been overcome, as the cluster has only a few hours to spare during the aforementioned beam preparation. By improving the virtual image deployment and integrating the OpenStack services with the core services of the Data Acquisition on the CMS Online cluster, it is now possible to start a thousand virtual machines within 10 minutes and to turn them off within seconds. This document explains the architectural choices that were made to reach a fully redundant and scalable cloud, with minimal impact on the running cluster configuration while giving maximal segregation between the services. It also presents how to cold start 1000 virtual machines 25 times faster, using tools commonly utilised in all data centres. [en_US]
dc.language.iso: en
dc.publisher: Sissa Medialab [en_US]
dc.relation.isversionof: http://dx.doi.org/10.22323/1.270.0022 [en_US]
dc.rights: Creative Commons Attribution-NonCommercial-NoDerivs License [en_US]
dc.rights.uri: http://creativecommons.org/licenses/by-nc-nd/4.0/ [en_US]
dc.source: Proceedings of Science [en_US]
dc.title: Opportunistic usage of the CMS online cluster using a cloud overlay [en_US]
dc.type: Article [en_US]
dc.identifier.citation: Chaze, O. et al. "Opportunistic usage of the CMS online cluster using a cloud overlay." International Symposium on Grids and Clouds (ISGC) 2016, March 2016, Academia Sinica, Taipei, Taiwan, Sissa Medialab, 2016. © 2016 The Authors [en_US]
dc.contributor.department: Massachusetts Institute of Technology. Laboratory for Nuclear Science
dc.contributor.department: Massachusetts Institute of Technology. Department of Physics
dc.relation.journal: Proceedings of Science [en_US]
dc.eprint.version: Final published version [en_US]
dc.type.uri: http://purl.org/eprint/type/ConferencePaper [en_US]
eprint.status: http://purl.org/eprint/status/NonPeerReviewed [en_US]
dc.date.updated: 2019-04-29T19:56:38Z
dspace.date.submission: 2019-04-29T19:56:40Z
mit.journal.volume: 270 [en_US]
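
The figures quoted in the abstract can be related to one another with some simple arithmetic. The sketch below is illustrative only and is not taken from the paper: the constants are the values stated in the abstract, while the function names and the choice of 3 hours for "a few hours every day" are assumptions made for this example.

```python
# Back-of-the-envelope arithmetic on the figures quoted in the abstract.
# All function names are hypothetical; nothing here is measured.

ONLINE_HS06 = 352_000          # HEP-SPEC-06 score of the CMS Online cluster
TIER1_HS06 = 300_000           # HEP-SPEC-06 score across CMS Tier1 sites
MAINTENANCE_DAYS_2015 = 117    # days the LHC was non-operational in 2015
VMS_STARTED = 1000             # virtual machines started in one batch
START_MINUTES = 10             # time quoted for starting that batch

# ASSUMPTION for illustration: "a few hours every day" is taken as 3 hours.
ASSUMED_SPARE_HOURS_PER_DAY = 3.0


def capacity_ratio(online_hs06: float, tier1_hs06: float) -> float:
    """Online cluster capacity relative to the Tier1 sites."""
    return online_hs06 / tier1_hs06


def fraction_of_year(days: float, days_in_year: int = 365) -> float:
    """Share of a year covered by the given number of days."""
    return days / days_in_year


def vm_start_rate(vms: int, minutes: float) -> float:
    """Average number of virtual machines started per minute."""
    return vms / minutes


def extra_day_equivalents(hours_per_day: float, operating_days: int) -> float:
    """Full-day equivalents gained from spare hours on operating days."""
    return hours_per_day * operating_days / 24.0


if __name__ == "__main__":
    operating_days = 365 - MAINTENANCE_DAYS_2015
    print(f"Online / Tier1 capacity: {capacity_ratio(ONLINE_HS06, TIER1_HS06):.2f}")
    print(f"Maintenance share of 2015: {fraction_of_year(MAINTENANCE_DAYS_2015):.1%}")
    print(f"VM start rate: {vm_start_rate(VMS_STARTED, START_MINUTES):.0f} per minute")
    print(
        "Extra full-day equivalents under the 3-hour assumption: "
        f"{extra_day_equivalents(ASSUMED_SPARE_HOURS_PER_DAY, operating_days):.1f}"
    )
```

Changing `ASSUMED_SPARE_HOURS_PER_DAY` shows how sensitive the estimated gain is to the actual length of the beam-preparation periods, which the abstract does not specify.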

