ASR on speech reconstructed from short-time Fourier phase spectra
| Title | ASR on speech reconstructed from short-time Fourier phase spectra |
|---|---|
| Author | Alsteris, Leigh; Paliwal, Kuldip Kumar |
| Publication Title | Interspeech 2004 (ICSLP) |
| Editor | Soon Hyob Kim and Dae Hee Youn |
| Year Published | 2004 |
| Place of publication | Korea |
| Publisher | Sunjin Printing Co. |
| Abstract | In our earlier papers, we measured human intelligibility of speech stimuli reconstructed either from the short-time magnitude spectra (magnitude-only stimuli) or the short-time phase spectra (phase-only stimuli) of a speech stimulus. We demonstrated that, even for small analysis window durations of 20-40 ms (of relevance to automatic speech recognition), the short-time phase spectrum can contribute to speech intelligibility as much as the short-time magnitude spectrum. In this paper, we perform automatic speech recognition on magnitude-only and phase-only stimuli. When employing an MFCC-based front-end, the recognition achieved for these phase-only stimuli is much worse than that for magnitude-only stimuli at small analysis window durations, which is not consistent with their corresponding human intelligibility results. This implies that the MFCC feature set is not capturing all of the discriminating information present in the speech signal. |
| Peer Reviewed | Yes |
| Published | Yes |
| Publisher URI | http://www.isca-speech.org/index.php |
| Alternative URI | http://www.isca-speech.org/archive/interspeech_2004/ |
| ISSN | 1225-441X |
| Conference name | 8th International Conference on Spoken Language Processing |
| Location | Jeju, Korea |
| Date From | 2004-10-04 |
| Date To | 2004-10-08 |
| URI | http://hdl.handle.net/10072/2123 |
| Date Accessioned | 2005-03-31 |
| Date Available | 2009-09-21T05:50:47Z |
| Language | en_AU |
| Research Centre | Institute for Integrated and Intelligent Systems |
| Faculty | Faculty of Engineering and Information Technology |
| Subject | PRE2009-Speech Recognition |
| Publication Type | Conference Publications (Full Written Paper - Refereed) |
| Publication Type Code | e1 |
Please use this identifier to cite this record: http://hdl.handle.net/10072/2123
Griffith University copyright notice
Copyright in individual works within the repository belongs to their authors or publishers. You may make a print or digital copy of a work for your personal non-commercial use. All other rights are reserved, except for fair dealings or other user rights granted by the copyright laws of your country.