Energy-Efficient Speaker Identification with Low-Precision Networks
Author(s)
Koppula, Skanda K.; Glass, James R.; Chandrakasan, Anantha P.
Abstract
Power consumption in small devices is dominated by off-chip memory accesses, necessitating small models that can fit in on-chip memory. In the task of text-dependent speaker identification, we demonstrate a 16x byte-size reduction for state-of-the-art small-footprint LCN/CNN/DNN speaker identification models. We achieve this by using ternary quantization that constrains the weights to {-1, 0, 1}. Our model comfortably fits in the 1 MB on-chip BRAM of most off-the-shelf FPGAs, allowing for a power-efficient speaker ID implementation with 100x fewer floating-point multiplications and a 1000x decrease in estimated energy cost. Additionally, we explore the use of depth-wise separable convolutions for speaker identification, and show that, while they significantly reduce multiplications in full-precision networks, they perform poorly when ternarized. We simulate hardware designs for inference on our model, the first hardware design targeted for efficient evaluation of ternary networks and end-to-end neural network-based speaker identification.
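As a rough illustration of the ternary constraint described in the abstract, the following Python sketch maps float weights to {-1, 0, 1} and packs them at 2 bits per weight. It is not the authors' method: the threshold rule (0.7 times the mean absolute weight), the 2-bit encoding, and the names `ternarize`, `pack_ternary`, `unpack_ternary`, and `threshold_scale` are all assumptions made for this example. Under the further assumption that the full-precision baseline stores 32-bit floats, 2-bit codes give a 32/2 = 16x size reduction, which is consistent with the 16x figure quoted above.

```python
import struct


def ternarize(weights, threshold_scale=0.7):
    """Map each float weight to -1, 0, or +1 using a magnitude threshold.

    Assumption: the threshold is threshold_scale * mean(|w|), a common
    heuristic chosen for illustration, not taken from the paper.
    """
    mean_abs = sum(abs(w) for w in weights) / len(weights)
    delta = threshold_scale * mean_abs
    return [0 if abs(w) <= delta else (1 if w > 0 else -1) for w in weights]


def pack_ternary(ternary):
    """Pack ternary values at 2 bits each, four values per byte (assumed encoding)."""
    codes = {0: 0b00, 1: 0b01, -1: 0b10}
    packed = bytearray((len(ternary) + 3) // 4)
    for i, t in enumerate(ternary):
        packed[i // 4] |= codes[t] << (2 * (i % 4))
    return bytes(packed)


def unpack_ternary(packed, count):
    """Recover the first `count` ternary values from the packed bytes."""
    values = {0b00: 0, 0b01: 1, 0b10: -1}
    return [values[(packed[i // 4] >> (2 * (i % 4))) & 0b11] for i in range(count)]


# Toy weights, invented for this example.
weights = [0.9, -0.05, -1.2, 0.3, 0.02, -0.6, 0.75, 0.0]
ternary = ternarize(weights)
packed = pack_ternary(ternary)

# Size of the same weights stored as 32-bit floats, for comparison.
float32_bytes = len(struct.pack(f"{len(weights)}f", *weights))
ratio = float32_bytes / len(packed)

print(ternary)
print(float32_bytes, len(packed), ratio)
```

Because every weight is -1, 0, or 1, a dot product against ternary weights needs only additions, subtractions, and skips, which is one plausible reading of how the reported reduction in floating-point multiplications arises.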
Date issued
2018-09
Department
Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science
Journal
2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)
Publisher
Institute of Electrical and Electronics Engineers (IEEE)
Citation
Koppula, Skanda et al. "Energy-Efficient Speaker Identification with Low-Precision Networks." 2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), April 2018, Calgary, AB, Canada, Institute of Electrical and Electronics Engineers (IEEE), September 2018. © 2018 IEEE
Version: Author's final manuscript
ISSN
2379-190X