
dc.contributor.author    Aarrestad, Thea
dc.contributor.author    Loncar, Vladimir
dc.contributor.author    Ghielmetti, Nicolò
dc.contributor.author    Pierini, Maurizio
dc.contributor.author    Summers, Sioni
dc.contributor.author    Ngadiuba, Jennifer
dc.contributor.author    Petersson, Christoffer
dc.contributor.author    Linander, Hampus
dc.contributor.author    Iiyama, Yutaro
dc.contributor.author    Di Guglielmo, Giuseppe
dc.contributor.author    Duarte, Javier
dc.contributor.author    Harris, Philip
dc.contributor.author    Rankin, Dylan
dc.contributor.author    Jindariani, Sergo
dc.contributor.author    Pedro, Kevin
dc.contributor.author    Tran, Nhan
dc.contributor.author    Liu, Mia
dc.contributor.author    Kreinar, Edward
dc.contributor.author    Wu, Zhenbin
dc.contributor.author    Hoang, Duc
dc.date.accessioned    2022-04-26T18:31:03Z
dc.date.available    2022-04-26T18:31:03Z
dc.date.issued    2021
dc.identifier.uri    https://hdl.handle.net/1721.1/142113
dc.description.abstract    We introduce an automated tool for deploying ultra low-latency, low-power deep neural networks with convolutional layers on field-programmable gate arrays (FPGAs). By extending the hls4ml library, we demonstrate an inference latency of 5 µs using convolutional architectures, targeting microsecond latency applications like those at the CERN Large Hadron Collider. Considering benchmark models trained on the Street View House Numbers Dataset, we demonstrate various methods for model compression in order to fit the computational constraints of a typical FPGA device used in trigger and data acquisition systems of particle detectors. In particular, we discuss pruning and quantization-aware training, and demonstrate how resource utilization can be significantly reduced with little to no loss in model accuracy. We show that the FPGA critical resource consumption can be reduced by 97% with zero loss in model accuracy, and by 99% when tolerating a 6% accuracy degradation.    en_US
dc.language.iso    en
dc.publisher    IOP Publishing    en_US
dc.relation.isversionof    10.1088/2632-2153/AC0EA1    en_US
dc.rights    Creative Commons Attribution 4.0 International license    en_US
dc.rights.uri    https://creativecommons.org/licenses/by/4.0/    en_US
dc.source    IOP Publishing    en_US
dc.title    Fast convolutional neural networks on FPGAs with hls4ml    en_US
dc.type    Article    en_US
dc.identifier.citation    Aarrestad, Thea, Loncar, Vladimir, Ghielmetti, Nicolò, Pierini, Maurizio, Summers, Sioni et al. 2021. "Fast convolutional neural networks on FPGAs with hls4ml." Machine Learning: Science and Technology, 2 (4).
dc.contributor.department    Massachusetts Institute of Technology. Department of Physics
dc.relation.journal    Machine Learning: Science and Technology    en_US
dc.eprint.version    Final published version    en_US
dc.type.uri    http://purl.org/eprint/type/JournalArticle    en_US
eprint.status    http://purl.org/eprint/status/PeerReviewed    en_US
dc.date.updated    2022-04-26T18:26:14Z
dspace.orderedauthors    Aarrestad, T; Loncar, V; Ghielmetti, N; Pierini, M; Summers, S; Ngadiuba, J; Petersson, C; Linander, H; Iiyama, Y; Di Guglielmo, G; Duarte, J; Harris, P; Rankin, D; Jindariani, S; Pedro, K; Tran, N; Liu, M; Kreinar, E; Wu, Z; Hoang, D    en_US
dspace.date.submission    2022-04-26T18:26:17Z
mit.journal.volume    2    en_US
mit.journal.issue    4    en_US
mit.license    PUBLISHER_CC
mit.metadata.status    Authority Work and Publication Information Needed    en_US

