Single-channel speech enhancement using Kalman filtering in the modulation domain
| File | Size | Format |
|---|---|---|
| 65040_1.pdf | 1087 KB | Adobe PDF |
| Title | Single-channel speech enhancement using Kalman filtering in the modulation domain |
|---|---|
| Author | So, Stephen; Wojcicki, Kamil; Paliwal, Kuldip Kumar |
| Publication Title | Proceedings of the 11th Annual Conference of the International Speech Communication Association (INTERSPEECH 2010) |
| Editor | Satoshi Nakamura |
| Year Published | 2010 |
| Publisher | International Speech Communication Association (ISCA) |
| Abstract | In this paper, we propose the modulation-domain Kalman filter (MDKF) for speech enhancement. In contrast to previous modulation domain enhancement methods based on bandpass filtering, the MDKF is an adaptive and linear MMSE estimator that uses models of the temporal changes of the magnitude spectrum for both speech and noise. Also, because the Kalman filter is a joint magnitude and phase spectrum estimator, under non-stationarity assumptions, it is highly suited for modulation-domain processing, as modulation phase tends to contain more speech information than acoustic phase. Experimental results from the NOIZEUS corpus show the ideal MDKF (with clean speech parameters) to outperform all the acoustic and time-domain enhancement methods that were evaluated, including the conventional time-domain Kalman filter with clean speech parameters. A practical MDKF that uses the MMSE-STSA method to enhance noisy speech in the acoustic domain prior to LPC analysis was also evaluated and showed promising results. |
| Peer Reviewed | Yes |
| Published | Yes |
| Publisher URI | http://www.isca-speech.org/iscaweb/ |
| Alternative URI | http://www.interspeech2010.org |
| Copyright Statement | Copyright 2010 ISCA and the Authors. This is the author-manuscript version of this paper. Reproduced in accordance with the copyright policy of the publisher. For information about this conference please refer to the conference's website or contact the authors. |
| ISSN | 1990-9772 |
| Conference name | Interspeech 2010 |
| Location | Makuhari, Japan |
| Date From | 2010-09-26 |
| Date To | 2010-09-30 |
| URI | http://hdl.handle.net/10072/36143 |
| Date Accessioned | 2010-10-19 |
| Date Available | 2011-02-14T09:13:04Z |
| Language | en_AU |
| Research Centre | Institute for Integrated and Intelligent Systems |
| Faculty | Faculty of Science, Environment, Engineering and Technology |
| Subject | Signal Processing |
| Publication Type | Conference Publications (Full Written Paper - Refereed) |
| Publication Type Code | e1 |
Please use this identifier to cite this record: http://hdl.handle.net/10072/36143