Show simple item record

dc.contributor.author	Duarte, Javier
dc.contributor.author	Harris, Philip
dc.contributor.author	Hauck, Scott
dc.contributor.author	Holzman, Burt
dc.contributor.author	Hsu, Shih-Chieh
dc.contributor.author	Jindariani, Sergo
dc.contributor.author	Khan, Suffian
dc.contributor.author	Kreis, Benjamin
dc.contributor.author	Lee, Brian
dc.contributor.author	Liu, Mia
dc.contributor.author	Lončar, Vladimir
dc.contributor.author	Ngadiuba, Jennifer
dc.contributor.author	Pedro, Kevin
dc.contributor.author	Perez, Brandon
dc.contributor.author	Pierini, Maurizio
dc.contributor.author	Rankin, Dylan
dc.date.accessioned	2021-09-20T17:17:14Z
dc.date.available	2021-09-20T17:17:14Z
dc.date.issued	2019-10-14
dc.identifier.uri	https://hdl.handle.net/1721.1/131480
dc.description.abstract	Large-scale particle physics experiments face challenging demands for high-throughput computing resources both now and in the future. New heterogeneous computing paradigms on dedicated hardware with increased parallelization, such as Field Programmable Gate Arrays (FPGAs), offer exciting solutions with large potential gains. The growing applications of machine learning algorithms in particle physics for simulation, reconstruction, and analysis are naturally deployed on such platforms. We demonstrate that the acceleration of machine learning inference as a web service represents a heterogeneous computing solution for particle physics experiments that potentially requires minimal modification to the current computing model. As examples, we retrain the ResNet-50 convolutional neural network to demonstrate state-of-the-art performance for top quark jet tagging at the LHC and apply a ResNet-50 model with transfer learning for neutrino event classification. Using Project Brainwave by Microsoft to accelerate the ResNet-50 image classification model, we achieve average inference times of 60 (10) ms with our experimental physics software framework using Brainwave as a cloud (edge or on-premises) service, representing an improvement by a factor of approximately 30 (175) in model inference latency over traditional CPU inference in current experimental hardware. A single FPGA service accessed by many CPUs achieves a throughput of 600–700 inferences per second using an image batch of one, comparable to large batch-size GPU throughput and significantly better than small batch-size GPU throughput. Deployed as an edge or cloud service for the particle physics computing model, coprocessor accelerators can have a higher duty cycle and are potentially much more cost-effective.	en_US
dc.publisher	Springer International Publishing	en_US
dc.relation.isversionof	https://doi.org/10.1007/s41781-019-0027-2	en_US
dc.rights	Creative Commons Attribution-Noncommercial-Share Alike	en_US
dc.rights.uri	http://creativecommons.org/licenses/by-nc-sa/4.0/	en_US
dc.source	Springer International Publishing	en_US
dc.title	FPGA-Accelerated Machine Learning Inference as a Service for Particle Physics Computing	en_US
dc.type	Article	en_US
dc.identifier.citation	Computing and Software for Big Science. 2019 Oct 14;3(1):13	en_US
dc.contributor.department	Massachusetts Institute of Technology. Department of Physics
dc.eprint.version	Author's final manuscript	en_US
dc.type.uri	http://purl.org/eprint/type/JournalArticle	en_US
eprint.status	http://purl.org/eprint/status/PeerReviewed	en_US
dc.date.updated	2020-09-24T21:19:38Z
dc.language.rfc3066	en
dc.rights.holder	Springer Nature Switzerland AG
dspace.embargo.terms	Y
dspace.date.submission	2020-09-24T21:19:38Z
mit.license	OPEN_ACCESS_POLICY
mit.metadata.status	Authority Work and Publication Information Needed

