Algorithms and low power hardware for keyword spotting

Wang, Miaorong

dc.contributor.advisor	Anantha P. Chandrakasan.	en_US
dc.contributor.author	Wang, Miaorong	en_US
dc.contributor.other	Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science.	en_US
dc.date.accessioned	2018-09-17T15:54:43Z
dc.date.available	2018-09-17T15:54:43Z
dc.date.copyright	2018	en_US
dc.date.issued	2018	en_US
dc.identifier.uri	http://hdl.handle.net/1721.1/118035
dc.description	Thesis: S.M., Massachusetts Institute of Technology, Department of Electrical Engineering and Computer Science, 2018.	en_US
dc.description	Cataloged from PDF version of thesis.	en_US
dc.description	Includes bibliographical references (pages 73-76).	en_US
dc.description.abstract	Keyword spotting (KWS) is widely used in mobile devices to provide a hands-free interface. A KWS system continuously listens to all sound signals, detects specific keywords, and triggers the downstream system. The key design targets of a KWS system are high classification accuracy on the specified keywords and low power consumption during real-time processing of speech data. An algorithm based on a convolutional neural network (CNN) delivers high accuracy with a small model that can be stored in on-chip memory. However, state-of-the-art neural network (NN) accelerators either target complex tasks using large CNN models, e.g. AlexNet, or support only limited NN architectures, which deliver lower classification accuracy for KWS. This thesis takes an algorithm-and-hardware co-design approach to implement a low power NN accelerator for the KWS system that is able to process CNNs with flexible structures. On the algorithm side, we propose a weight tuning method that tweaks the bits of weights to lower the switching activity in the weight network-on-chip (NoC) and multipliers. The algorithm takes in 2's complement 8-bit original weights and outputs sign-magnitude 8-bit tuned weights. In our experiment, a 60.96% reduction in the toggle count of weights is achieved with a 0.75% loss in accuracy. On the hardware side, we implement a processing element (PE) to efficiently process the tuned weights. It takes in sign-magnitude weights and input activations, and multiplies them using an unsigned multiplier. An XOR gate is used to generate the sign bit of the product. The sign-magnitude product is converted back to 2's complement representation and accumulated using an adder-and-subtractor. The sign bit of the product is used as a carry bit to do the conversion. Compared with a PE that processes the original 2's complement weights, a power reduction of around 35% is observed.
Finally, this thesis presents a CNN accelerator that consumes 1.2 mW while processing speech data in real time, with an accuracy of around 87.3% on the Google Speech Commands dataset [34].	en_US
dc.description.statementofresponsibility	by Miaorong Wang.	en_US
dc.format.extent	76 pages	en_US
dc.language.iso	eng	en_US
dc.publisher	Massachusetts Institute of Technology	en_US
dc.rights	MIT theses are protected by copyright. They may be viewed, downloaded, or printed from this source but further reproduction or distribution in any format is prohibited without written permission.	en_US
dc.rights.uri	http://dspace.mit.edu/handle/1721.1/7582	en_US
dc.subject	Electrical Engineering and Computer Science.	en_US
dc.title	Algorithms and low power hardware for keyword spotting	en_US
dc.type	Thesis	en_US
dc.description.degree	S.M.	en_US
dc.contributor.department	Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science
dc.identifier.oclc	1051458925	en_US
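
Illustrative note on the abstract: the processing element it describes (unsigned multiply of sign-magnitude operands, XOR for the product sign, and conversion to 2's complement inside an adder-and-subtractor with the sign bit as the carry) can be sketched in software as follows. This is a minimal behavioral model and not the thesis's implementation. The function names, the 24-bit accumulator width, and the definition of "toggle count" as the number of bit flips between consecutive weights are all assumptions made for illustration. The weight tuning method itself is not modeled, since the abstract does not describe it.

```python
def to_twos_complement(value: int, bits: int = 8) -> int:
    """Return the two's complement bit pattern of a signed integer."""
    return value & ((1 << bits) - 1)


def from_twos_complement(pattern: int, bits: int) -> int:
    """Decode a two's complement bit pattern back into a signed integer."""
    sign_bit = 1 << (bits - 1)
    return pattern - (1 << bits) if pattern & sign_bit else pattern


def to_sign_magnitude(value: int, bits: int = 8) -> int:
    """Return the sign-magnitude bit pattern (MSB is the sign) of an integer."""
    max_magnitude = (1 << (bits - 1)) - 1
    if abs(value) > max_magnitude:
        raise ValueError(f"{value} does not fit in {bits}-bit sign-magnitude")
    sign = 1 if value < 0 else 0
    return (sign << (bits - 1)) | abs(value)


def toggle_count(patterns: list[int]) -> int:
    """Sum of bit flips between consecutive patterns (assumed toggle metric)."""
    return sum(bin(a ^ b).count("1") for a, b in zip(patterns, patterns[1:]))


def sign_magnitude_mac(acc: int, weight: int, activation: int,
                       bits: int = 8, acc_bits: int = 24) -> int:
    """One multiply-accumulate step on sign-magnitude operands.

    `acc` and the return value are two's complement patterns of width
    `acc_bits` (an assumed width, not taken from the thesis).
    """
    magnitude_mask = (1 << (bits - 1)) - 1
    acc_mask = (1 << acc_bits) - 1

    # XOR gate: the product is negative when exactly one operand is negative.
    sign = ((weight ^ activation) >> (bits - 1)) & 1

    # Unsigned multiplier: only the magnitudes are multiplied.
    product = (weight & magnitude_mask) * (activation & magnitude_mask)

    # Adder-and-subtractor: invert the product when the sign is set and feed
    # the sign in as the carry, so that acc + ~product + 1 == acc - product.
    operand = product ^ (acc_mask if sign else 0)
    return (acc + operand + sign) & acc_mask
```

For the toy weight sequence 1, -1, 2, -2, this assumed metric gives 20 toggles in 2's complement and 5 in sign-magnitude, which illustrates why small weights that alternate in sign switch fewer bits in the sign-magnitude format. The measured 60.96% reduction reported in the abstract comes from the thesis's own tuning method and data, not from this sketch.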

Files in this item

Name: 1051458925-MIT.pdf
Size: 7.945Mb
Format: PDF
Description: Full printable version

This item appears in the following Collection(s)

Graduate Theses
