Use of speech presence uncertainty with MMSE spectral energy estimation for robust automatic speech recognition
| Title | Use of speech presence uncertainty with MMSE spectral energy estimation for robust automatic speech recognition |
|---|---|
| Author | Stark, Anthony Phillip; Paliwal, Kuldip Kumar |
| Journal Name | Speech Communication |
| Year Published | 2011 |
| Place of publication | Netherlands |
| Publisher | Elsevier |
| Abstract | In this paper, we investigate the use of the minimum mean square error (MMSE) spectral energy estimator for use in environment-robust automatic speech recognition (ASR). In the past, it has been common to use the MMSE log-spectral amplitude estimator for this task. However, this estimator was originally derived under subjective human listening criteria. Therefore its complex suppression rule may not be optimal for use in ASR. On the other hand, it can be shown that the MMSE spectral energy estimator is closely related to the MMSE Mel-frequency cepstral coefficient (MFCC) estimator. Despite this, the spectral energy estimator has tended to suffer from the problem of excessive residual noise. We examine the cause of this residual noise and show that the introduction of a heuristic-based speech presence uncertainty (SPU) can significantly improve its performance as a front-end ASR enhancement regime. The proposed spectral energy SPU estimator is evaluated on the Aurora2, RM and OLLO2 speech recognition tasks and can be shown to significantly improve additive noise robustness over the more common spectral amplitude and log-spectral amplitude estimators. |
| Peer Reviewed | Yes |
| Published | Yes |
| Alternative URI | http://dx.doi.org/10.1016/j.specom.2010.08.001 |
| Volume | 53 |
| Page from | 51 |
| Page to | 61 |
| ISSN | 0167-6393 |
| Date Accessioned | 2011-11-07; 2012-02-14T05:33:49Z |
| Date Available | 2012-02-14T05:33:49Z |
| Research Centre | Institute for Integrated and Intelligent Systems |
| Faculty | Faculty of Science, Environment, Engineering and Technology |
| Subject | Signal Processing |
| URI | http://hdl.handle.net/10072/42592 |
| Publication Type | Journal Articles (Refereed Article) |
| Publication Type Code | c1 |
Please use this identifier to cite this record: http://hdl.handle.net/10072/42592