A 6 mW, 5,000-Word Real-Time Speech Recognizer Using WFST Models

Price, Michael; Glass, James; Chandrakasan, Anantha P.

Author(s)

Chandrakasan, Anantha P.; Price, Michael R.; Glass, James R.

Download: asrv1_journal_short.pdf (2.748Mb)

Terms of use

Creative Commons Attribution-Noncommercial-Share Alike http://creativecommons.org/licenses/by-nc-sa/4.0/

Abstract

We describe an IC that provides a local speech recognition capability for a variety of electronic devices. We start with a generic speech decoder architecture that is programmable with industry-standard WFST and GMM speech models. Algorithmic and architectural enhancements are incorporated in order to achieve real-time performance amid system-level constraints on internal memory size and external memory bandwidth. A 2.5 × 2.5 mm test chip implementing this architecture was fabricated using a 65 nm process. The chip performs a 5,000-word recognition task in real time with 13.0% word error rate, 6.0 mW core power consumption, and a search efficiency of approximately 16 nJ per hypothesis.
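As a rough sanity check on the figures quoted in the abstract, the sketch below divides the core power by the energy per hypothesis to get an implied hypothesis rate of roughly 375,000 per second. It assumes, purely for illustration, that all of the core power is spent on search, so the result is an upper bound and not a measured throughput. The function and constant names are hypothetical and do not come from the paper.

```python
def implied_hypothesis_rate(core_power_w: float, energy_per_hypothesis_j: float) -> float:
    """Return hypotheses per second implied by a power budget.

    Assumption (not from the paper): the entire power budget is spent
    on search, so this is an upper bound on the real rate.
    """
    if energy_per_hypothesis_j <= 0:
        raise ValueError("energy per hypothesis must be positive")
    return core_power_w / energy_per_hypothesis_j


# Figures quoted in the abstract.
CORE_POWER_W = 6.0e-3            # 6.0 mW core power consumption
ENERGY_PER_HYPOTHESIS_J = 16e-9  # approximately 16 nJ per hypothesis

rate = implied_hypothesis_rate(CORE_POWER_W, ENERGY_PER_HYPOTHESIS_J)
print(f"{rate:,.0f} hypotheses/s (upper bound under the stated assumption)")
```

Because the core power also covers parts of the chip other than search, the rate actually sustained on the 5,000-word task may be lower than this bound.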

Date issued

2014-12

URI

http://hdl.handle.net/1721.1/102176

Department

Massachusetts Institute of Technology. Computer Science and Artificial Intelligence Laboratory; Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science

Journal

IEEE Journal of Solid-State Circuits

Publisher

Institute of Electrical and Electronics Engineers (IEEE)

Citation

Price, Michael, James Glass, and Anantha P. Chandrakasan. “A 6 mW, 5,000-Word Real-Time Speech Recognizer Using WFST Models.” IEEE Journal of Solid-State Circuits 50, no. 1 (January 2015): 102–112.

Version: Author's final manuscript

ISSN

0018-9200

1558-173X

Collections

MIT Open Access Articles

DSpace@MIT