Closed-loop auditory-based representation for robust speech recognition
Author(s): Lee, Chia-ying (Chia-ying Jackie)
Massachusetts Institute of Technology. Dept. of Electrical Engineering and Computer Science.
James R. Glass and Oded Ghitza.
A closed-loop, auditory-based speech feature extraction algorithm is presented to address the problem of unseen noise in robust speech recognition. The closed-loop model is inspired by the possible role of the medial olivocochlear (MOC) efferent system of the human auditory periphery, which has been suggested in [6, 13, 42] to be important for human speech intelligibility in noisy environments. We propose that, instead of using a fixed filter bank, a feature extraction algorithm should use filters flexible enough to adapt dynamically to different types of background noise. In the closed-loop model, a feedback mechanism is therefore designed to regulate the operating points of the filters in the filter bank based on the background noise. The model is tested on a dataset created from the TIDigits database, in which five kinds of noise are added to synthesize noisy speech. Compared with the standard MFCC extraction algorithm, the proposed closed-loop feature extraction algorithm provides average absolute word error rate reductions of 9.7%, 9.1% and 11.4% for three kinds of filter banks, respectively.
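The feedback idea in the abstract can be illustrated with a minimal sketch. It is not the thesis's MOC model, which this record does not describe. It assumes, purely for illustration, that each filter's operating point is a single gain in dB, and that the loop adjusts this gain until the noise energy at the channel output reaches a common target level. The function name `adapt_channel_gains` and the parameters `target_db`, `step`, `n_iter` and `tol` are hypothetical.

```python
def adapt_channel_gains(noise_energy_db, target_db=0.0, step=0.5,
                        n_iter=50, tol=1e-3):
    """Illustrative feedback loop (an assumption, not the thesis's method).

    noise_energy_db: estimated background-noise energy per channel, in dB.
    Returns one gain per channel (dB) that drives the noise energy at the
    channel output toward target_db.
    """
    gains_db = [0.0] * len(noise_energy_db)
    for _ in range(n_iter):
        max_err = 0.0
        for ch, noise_db in enumerate(noise_energy_db):
            out_db = noise_db + gains_db[ch]   # noise level after the gain
            err = target_db - out_db           # feedback error signal
            gains_db[ch] += step * err         # proportional correction
            max_err = max(max_err, abs(err))
        if max_err < tol:                      # every channel near target
            break
    return gains_db


# Hypothetical noise estimates for three channels (dB).
noise = [20.0, 5.0, -3.0]
gains = adapt_channel_gains(noise)
```

In this sketch, channels with more background noise end up with lower gain, so the same filter bank settles at different operating points for different noise types instead of staying fixed.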
Thesis (S.M.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 2010. Includes bibliographical references (p. 93-96).
Department: Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science
Massachusetts Institute of Technology
Electrical Engineering and Computer Science.