Real-time noise-robust speech detection

Luu, Kevin Y

Other Contributors

Massachusetts Institute of Technology. Dept. of Electrical Engineering and Computer Science.

Advisor

James R. Glass and David Scott Cyphers.

Terms of use

M.I.T. theses are protected by copyright. They may be viewed from this source for any purpose, but reproduction or distribution in any format is prohibited without written permission. See provided URL for inquiries about permission. http://dspace.mit.edu/handle/1721.1/7582

Abstract

As part of the development of an autonomous forklift of the Agile Robotics Lab at MIT's Computer Science and Artificial Intelligence Lab (CSAIL), this thesis explores the effectiveness and application of various noise-robust techniques for real-time speech detection in real environments. Dynamic noises in the environment (including motor noise, babble noise, and other noises in a warehouse setting) can dramatically alter the speech signal, making speech detection much more difficult. Beyond the noise environments, another issue is the urgent nature of the situation, which leads to the production of shouted speech. Given these constraints, the forklift must be highly accurate in detecting speech at all times, since safety is a major concern in our application. This thesis analyzes different speech properties that would be useful in distinguishing speech from noise in various noise environments. We look at various features in an effort to optimize the overall shout detection system. This thesis also uses common signal processing techniques to enhance the speech signals in audio waveforms. Building on the optimal speech features and speech enhancement techniques, we present a shout detection algorithm that is optimized for the application of the autonomous forklift. We measure the performance of the resulting system by comparing it to other baseline systems and show a 38% improvement over a baseline task.

Description

Thesis (M. Eng.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 2010.

Cataloged from PDF version of thesis.

Includes bibliographical references (p. 87-89).

Date issued

2010

URI

http://hdl.handle.net/1721.1/62659

Department

Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science

Publisher

Massachusetts Institute of Technology

Keywords

Electrical Engineering and Computer Science.

Collections

Graduate Theses