Fast convolutional neural networks on FPGAs with hls4ml

Aarrestad, Thea; Loncar, Vladimir; Ghielmetti, Nicolò; Pierini, Maurizio; Summers, Sioni; Ngadiuba, Jennifer; Petersson, Christoffer; Linander, Hampus; Iiyama, Yutaro; Di Guglielmo, Giuseppe; Duarte, Javier; Harris, Philip; Rankin, Dylan; Jindariani, Sergo; Pedro, Kevin; Tran, Nhan; Liu, Mia; Kreinar, Edward; Wu, Zhenbin; Hoang, Duc

Publisher with Creative Commons License

Terms of use

Creative Commons Attribution 4.0 International license https://creativecommons.org/licenses/by/4.0/

Abstract

We introduce an automated tool for deploying ultra low-latency, low-power deep neural networks with convolutional layers on field-programmable gate arrays (FPGAs). By extending the hls4ml library, we demonstrate an inference latency of 5 µs using convolutional architectures, targeting microsecond latency applications like those at the CERN Large Hadron Collider. Considering benchmark models trained on the Street View House Numbers Dataset, we demonstrate various methods for model compression in order to fit the computational constraints of a typical FPGA device used in trigger and data acquisition systems of particle detectors. In particular, we discuss pruning and quantization-aware training, and demonstrate how resource utilization can be significantly reduced with little to no loss in model accuracy. We show that the FPGA critical resource consumption can be reduced by 97% with zero loss in model accuracy, and by 99% when tolerating a 6% accuracy degradation.

Date issued

2021

URI

https://hdl.handle.net/1721.1/142113

Department

Massachusetts Institute of Technology. Department of Physics

Journal

Machine Learning: Science and Technology

Publisher

IOP Publishing

Citation

Aarrestad, Thea, Loncar, Vladimir, Ghielmetti, Nicolò, Pierini, Maurizio, Summers, Sioni et al. 2021. "Fast convolutional neural networks on FPGAs with hls4ml." Machine Learning: Science and Technology, 2 (4).

Version: Final published version

Collections

MIT Open Access Articles