Robust Speech Recognition in Noisy Environments Based on Subband Spectral Centroid Histograms

File Size Format
41666_1.pdf 605Kb Adobe PDF
Title Robust Speech Recognition in Noisy Environments Based on Subband Spectral Centroid Histograms
Author Gajic, Bojana; Paliwal, Kuldip Kumar
Journal Name IEEE Transactions on Audio, Speech, and Language Processing
Year Published 2006
Place of publication USA
Publisher IEEE
Abstract We investigate how dominant-frequency information can be used in speech feature extraction to increase the robustness of automatic speech recognition against additive background noise. First, we review several earlier proposed auditory-based feature extraction methods and argue that the use of dominant-frequency information might be one of the major reasons for their improved noise robustness. Furthermore, we propose a new feature extraction method, which combines subband power information with dominant subband frequency information in a simple and computationally efficient way. The proposed features are shown to be considerably more robust against additive background noise than standard mel-frequency cepstrum coefficients on two different recognition tasks. The performance improvement increased as we moved from a small-vocabulary isolated-word task to a medium-vocabulary continuous-speech task, where the proposed features also outperformed a computationally expensive auditory-based method. The greatest improvement was obtained for noise types characterized by a relatively flat spectral density.
Peer Reviewed Yes
Published Yes
Publisher URI http://ieeexplore.ieee.org/servlet/opac?punumber=10376
Alternative URI http://dx.doi.org/10.1109/TSA.2005.855834
Copyright Statement Copyright 2006 IEEE. Personal use of this material is permitted. However, permission to reprint/republish this material for advertising or promotional purposes or for creating new collective works for resale or redistribution to servers or lists, or to reuse any copyrighted component of this work in other works must be obtained from the IEEE.
Volume 14
Issue Number 2
Page from 600
Page to 608
ISSN 1558-7916
Date Accessioned 2007-03-18
Language en_AU
Research Centre Institute for Integrated and Intelligent Systems
Faculty Faculty of Science, Environment, Engineering and Technology
Subject PRE2009-Speech Recognition
URI http://hdl.handle.net/10072/14345
Publication Type Journal Articles (Refereed Article)
Publication Type Code c1