Improved noise-robustness in distributed speech recognition via perceptually-weighted vector quantisation of filterbank energies
There are no files associated with this record.
| Field | Value |
|---|---|
| Title | Improved noise-robustness in distributed speech recognition via perceptually-weighted vector quantisation of filterbank energies |
| Author | So, Stephen; Paliwal, Kuldip Kumar |
| Publication Title | Interspeech 2005 - Eurospeech |
| Editor | Luís Oliveira |
| Year Published | 2005 |
| Place of publication | Lisbon, Portugal |
| Publisher | International Speech Communication Association (ISCA) |
| Abstract | In this paper, we examine a coding scheme for quantising feature vectors in a distributed speech recognition environment that is more robust to noise. It consists of a vector quantiser that operates on the logarithmic filterbank energies (LFBEs). Through the use of a perceptually-weighted Euclidean distance measure, which emphasises the LFBEs that represent the spectral peaks, the vector quantiser codebook provides *a priori* knowledge of the spectral characteristics of clean speech and is used to quantise features from noise-corrupted speech. Our comparative results from the ETSI Aurora-2 recognition task show that the perceptually-weighted vector quantisation of LFBEs achieves higher recognition accuracies for noisy speech than the unweighted vector quantisation, memoryless and multi-frame GMM-based block quantisation and scalar quantisation of Mel frequency-warped cepstral coefficients. |
| Peer Reviewed | Yes |
| Published | Yes |
| ISSN | 1018-4074 |
| Conference name | 9th European Conference on Speech Communication and Technology |
| Location | Lisbon, Portugal |
| Date From | 2005-09-04 |
| Date To | 2005-09-08 |
| URI | http://hdl.handle.net/10072/2673 |
| Date Accessioned | 2006-02-22 |
| Date Available | 2007-03-21T21:26:49Z |
| Language | en_AU |
| Research Centre | Institute for Integrated and Intelligent Systems |
| Faculty | Faculty of Engineering and Information Technology |
| Subject | Signal Processing; Speech Recognition |
| Publication Type | Conference Publications (Full Written Paper - Refereed) |
| Publication Type Code | e1 |
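
The weighted nearest-codevector search described in the abstract can be illustrated with a minimal sketch. This record does not give the paper's actual weighting function, so the weights below (`peak_weights`, with exponent `alpha`) are a hypothetical choice that simply grows with each band's log energy; the function names, the four-band vectors, and the two-entry codebook are likewise invented for illustration and are not taken from the paper.

```python
import math


def peak_weights(lfbe, alpha=0.5):
    """Hypothetical weighting (an assumption, not the paper's formula):
    bands with larger log energy (spectral peaks) receive larger weights.
    alpha = 0 gives uniform weights, i.e. the unweighted case."""
    top = max(lfbe)
    # Subtract the maximum before exponentiating for numerical stability.
    raw = [math.exp(alpha * (e - top)) for e in lfbe]
    total = sum(raw)
    return [r / total for r in raw]


def weighted_distance(x, c, w):
    """Weighted squared Euclidean distance between vector x and codevector c."""
    return sum(wi * (xi - ci) ** 2 for xi, ci, wi in zip(x, c, w))


def quantise(lfbe, codebook, alpha=0.5):
    """Return the index of the codevector closest to lfbe under the weighted distance."""
    w = peak_weights(lfbe, alpha)
    return min(range(len(codebook)),
               key=lambda k: weighted_distance(lfbe, codebook[k], w))


# Toy data (invented): a "clean" codevector with a strong peak in band 0,
# and a flatter alternative codevector.
codebook = [
    [10.0, 1.0, 1.0, 1.0],
    [7.0, 4.0, 4.0, 4.0],
]
# A "noisy" vector: the peak is intact, but noise has filled the valleys.
noisy = [10.0, 4.0, 4.0, 4.0]

unweighted_index = quantise(noisy, codebook, alpha=0.0)
weighted_index = quantise(noisy, codebook, alpha=0.5)
print(unweighted_index, weighted_index)
```

In this toy case the unweighted search is dominated by the noise-filled valleys and selects the flatter codevector, whereas the peak-emphasising weights select the codevector whose peak matches. This is only meant to convey the intuition behind emphasising spectral peaks; it makes no claim about the recognition results reported in the paper.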
Please use this identifier to cite this record: http://hdl.handle.net/10072/2673