Show simple item record

dc.contributor.author	Duarte, Javier
dc.contributor.author	Harris, Philip
dc.contributor.author	Hauck, Scott
dc.contributor.author	Holzman, Burt
dc.contributor.author	Hsu, Shih-Chieh
dc.contributor.author	Jindariani, Sergo
dc.contributor.author	Khan, Suffian
dc.contributor.author	Kreis, Benjamin
dc.contributor.author	Lee, Brian
dc.contributor.author	Liu, Mia
dc.contributor.author	Lončar, Vladimir
dc.contributor.author	Ngadiuba, Jennifer
dc.contributor.author	Pedro, Kevin
dc.contributor.author	Perez, Brandon
dc.contributor.author	Pierini, Maurizio
dc.contributor.author	Rankin, Dylan
dc.date.accessioned	2021-09-20T17:17:14Z
dc.date.available	2021-09-20T17:17:14Z
dc.date.issued	2019-10-14
dc.identifier.uri	https://hdl.handle.net/1721.1/131480
dc.description.abstract	Large-scale particle physics experiments face challenging demands for high-throughput computing resources both now and in the future. New heterogeneous computing paradigms on dedicated hardware with increased parallelization, such as Field Programmable Gate Arrays (FPGAs), offer exciting solutions with large potential gains. The growing applications of machine learning algorithms in particle physics for simulation, reconstruction, and analysis are naturally deployed on such platforms. We demonstrate that the acceleration of machine learning inference as a web service represents a heterogeneous computing solution for particle physics experiments that potentially requires minimal modification to the current computing model. As examples, we retrain the ResNet-50 convolutional neural network to demonstrate state-of-the-art performance for top quark jet tagging at the LHC and apply a ResNet-50 model with transfer learning for neutrino event classification. Using Project Brainwave by Microsoft to accelerate the ResNet-50 image classification model, we achieve average inference times of 60 (10) ms with our experimental physics software framework using Brainwave as a cloud (edge or on-premises) service, representing an improvement by a factor of approximately 30 (175) in model inference latency over traditional CPU inference in current experimental hardware. A single FPGA service accessed by many CPUs achieves a throughput of 600–700 inferences per second using an image batch of one, comparable to large batch-size GPU throughput and significantly better than small batch-size GPU throughput. Deployed as an edge or cloud service for the particle physics computing model, coprocessor accelerators can have a higher duty cycle and are potentially much more cost-effective.	en_US
dc.publisher	Springer International Publishing	en_US
dc.relation.isversionof	https://doi.org/10.1007/s41781-019-0027-2	en_US
dc.rights	Creative Commons Attribution-Noncommercial-Share Alike	en_US
dc.rights.uri	http://creativecommons.org/licenses/by-nc-sa/4.0/	en_US
dc.source	Springer International Publishing	en_US
dc.title	FPGA-Accelerated Machine Learning Inference as a Service for Particle Physics Computing	en_US
dc.type	Article	en_US
dc.identifier.citation	Computing and Software for Big Science. 2019 Oct 14;3(1):13	en_US
dc.contributor.department	Massachusetts Institute of Technology. Department of Physics
dc.eprint.version	Author's final manuscript	en_US
dc.type.uri	http://purl.org/eprint/type/JournalArticle	en_US
eprint.status	http://purl.org/eprint/status/PeerReviewed	en_US
dc.date.updated	2020-09-24T21:19:38Z
dc.language.rfc3066	en
dc.rights.holder	Springer Nature Switzerland AG
dspace.embargo.terms	Y
dspace.date.submission	2020-09-24T21:19:38Z
mit.license	OPEN_ACCESS_POLICY
mit.metadata.status	Authority Work and Publication Information Needed

