MFCC computation from magnitude spectrum of higher lag autocorrelation coefficients for robust speech recognition
| Title | MFCC computation from magnitude spectrum of higher lag autocorrelation coefficients for robust speech recognition |
|---|---|
| Author | Shannon, Ben James; Paliwal, Kuldip Kumar |
| Publication Title | Interspeech 2004 (ICSLP) |
| Editor | Soon Hyob Kim and Dae Hee Youn |
| Year Published | 2004 |
| Place of publication | Korea |
| Publisher | Sunjin Printing Co. |
| Abstract | Processing of the speech signal in the autocorrelation domain for robust feature extraction is based on two properties: 1) the pole-preserving property (the poles of the original signal are preserved in its autocorrelation function), and 2) the noise-separation property (the autocorrelation function of a noise signal is confined to the lower lags, while the speech signal contribution is spread over all lags, thus providing a way to eliminate noise by discarding the lower-lag autocorrelation coefficients). In this paper, we use these properties to derive robust features for automatic speech recognition. We compute the magnitude spectrum of the one-sided higher-lag autocorrelation sequence, process it through a Mel filter bank and parameterise it in terms of Mel Frequency Cepstral Coefficients (MFCCs). Since the proposed method combines autocorrelation domain processing with Mel filter bank analysis, we call the resulting MFCCs Autocorrelation Mel Frequency Cepstral Coefficients (AMFCCs). Recognition experiments on the Aurora II database show that the AMFCC representation performs as well as the MFCC representation in clean conditions and is more robust in the presence of background noise. |
| Peer Reviewed | Yes |
| Published | Yes |
| Publisher URI | http://www.isca-speech.org/index.php |
| Alternative URI | http://www.isca-speech.org/archive/interspeech_2004/ |
| ISBN | 1225-441X |
| Conference name | 8th International Conference on Spoken Language Processing |
| Location | Jeju Island, Korea |
| Date From | 2004-10-04 |
| Date To | 2004-10-08 |
| URI | http://hdl.handle.net/10072/2122 |
| Date Accessioned | 2005-03-31 |
| Date Available | 2009-09-21T05:49:16Z |
| Language | en_AU |
| Research Centre | Institute for Integrated and Intelligent Systems |
| Faculty | Faculty of Engineering and Information Technology |
| Subject | PRE2009-Speech Recognition |
| Publication Type | Conference Publications (Full Written Paper - Refereed) |
| Publication Type Code | e1 |
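
The processing chain described in the abstract (one-sided autocorrelation, removal of the lower lags, magnitude spectrum, Mel filter bank, cepstral coefficients) can be sketched roughly as follows. This is an illustrative sketch only, not the authors' implementation: the function names `mel_filterbank` and `amfcc`, the 8 kHz sample rate, the 2 ms lag cutoff, the Hamming window applied to the retained lags, the 23 filters, the 13 coefficients and the 512-point FFT are all assumptions chosen for the example and are not taken from this record.

```python
import numpy as np

def mel_filterbank(n_filters, n_fft, sample_rate):
    """Triangular filters spaced evenly on the mel scale (one row per filter)."""
    hz_to_mel = lambda f: 2595.0 * np.log10(1.0 + f / 700.0)
    mel_to_hz = lambda m: 700.0 * (10.0 ** (m / 2595.0) - 1.0)
    mel_points = np.linspace(hz_to_mel(0.0), hz_to_mel(sample_rate / 2.0),
                             n_filters + 2)
    hz_points = mel_to_hz(mel_points)
    freqs = np.fft.rfftfreq(n_fft, d=1.0 / sample_rate)
    bank = np.zeros((n_filters, freqs.size))
    for i in range(n_filters):
        lo, mid, hi = hz_points[i:i + 3]
        rising = (freqs - lo) / (mid - lo)
        falling = (hi - freqs) / (hi - mid)
        bank[i] = np.clip(np.minimum(rising, falling), 0.0, None)
    return bank

def amfcc(frame, sample_rate=8000, min_lag_ms=2.0,
          n_filters=23, n_ceps=13, n_fft=512):
    """AMFCC-style features for one speech frame (all defaults are assumed)."""
    frame = np.asarray(frame, dtype=float)
    n = frame.size
    # One-sided (lags 0..n-1) biased autocorrelation estimate.
    r = np.correlate(frame, frame, mode="full")[n - 1:] / n
    # Discard the lower lags, where the noise contribution is said to be
    # concentrated; the cutoff value is an assumption for this sketch.
    min_lag = int(round(min_lag_ms * 1e-3 * sample_rate))
    r_high = r[min_lag:]
    # Taper the retained lags before the transform (assumed window choice).
    r_high = r_high * np.hamming(r_high.size)
    # Magnitude spectrum of the higher-lag sequence (zero-padded to n_fft).
    spectrum = np.abs(np.fft.rfft(r_high, n_fft))
    # Mel filter bank energies, log compression, then a DCT-II.
    energies = mel_filterbank(n_filters, n_fft, sample_rate) @ spectrum
    log_energies = np.log(energies + 1e-10)
    k = np.arange(n_ceps)[:, None]
    m = np.arange(n_filters)[None, :]
    dct_basis = np.cos(np.pi * k * (m + 0.5) / n_filters)
    return dct_basis @ log_energies

# Usage: a synthetic 32 ms frame (a tone plus a little white noise).
rng = np.random.default_rng(0)
t = np.arange(256) / 8000.0
frame = np.sin(2 * np.pi * 440.0 * t) + 0.1 * rng.standard_normal(256)
features = amfcc(frame)
```

With these assumed defaults, `features` holds 13 coefficients per frame. The frame length must stay at or below `n_fft` plus the number of discarded lags, otherwise `np.fft.rfft` truncates the retained sequence instead of zero-padding it.
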
Please use this identifier to cite this record: http://hdl.handle.net/10072/2122