Scalable distributed speech recognition using multi-frame GMM-based block quantization

Title Scalable distributed speech recognition using multi-frame GMM-based block quantization
Author Paliwal, Kuldip Kumar; So, Stephen
Publication Title Interspeech 2004 (ICSLP)
Editor Soon Hyob Kim and Dae Hee Youn
Year Published 2004
Place of publication Korea
Publisher Sunjin Printing Co.
Abstract In this paper, we propose the use of the multi-frame Gaussian mixture model-based block quantizer for the coding of Mel frequency-warped cepstral coefficient (MFCC) features in distributed speech recognition (DSR) applications. This coding scheme exploits intraframe correlation via the Karhunen-Loeve transform (KLT) and interframe correlation via the joint processing of adjacent frames together with the computational simplicity of scalar quantization. The proposed coder is bit-rate scalable, which means that the bitrate can be adjusted without the need for re-training of the quantizers. Static parameters such as the probability density function (PDF) model and KLT orthogonal matrices are stored at the encoder and decoder and bit allocations are calculated 'on-the-fly' without intensive processing. This coding scheme is evaluated in this paper on the Aurora-2 database in a DSR framework. It is shown that this coding scheme achieves high recognition performance at lower bitrates, with a word error rate (WER) of 2.5% at 800 bps, which is less than 1% degradation from the baseline word recognition accuracy, and graceful degradation down to a WER of 7% at 300 bps.
Peer Reviewed Yes
Published Yes
Publisher URI http://www.isca-speech.org/index.php
Alternative URI http://www.isca-speech.org/archive/interspeech_2004/
ISSN 1225-441X
Conference name 8th International Conference on Spoken Language Processing (ICSLP-2004)
Location Jeju, Korea
Date From 2004-10-04
Date To 2004-10-08
URI http://hdl.handle.net/10072/2117
Date Accessioned 2005-03-31
Date Available 2009-09-22T05:48:56Z
Language en_AU
Research Centre Institute for Integrated and Intelligent Systems
Faculty Faculty of Engineering and Information Technology
Subject PRE2009-Signal Processing; PRE2009-Speech Recognition
Publication Type Conference Publications (Full Written Paper - Refereed)
Publication Type Code e1
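
Illustrative sketch (not part of the record). The abstract states that the coder is bit-rate scalable because bit allocations are computed on the fly from stored static parameters. The Python sketch below shows one common way such an allocation can be derived for any target bitrate without re-training. It is a generic greedy transform-coding rule, not necessarily the allocation used in the paper. The frame rate of 100 frames per second, the block length of 5 frames, the eigenvalues, and the function names `bits_per_block` and `allocate_bits` are all assumptions made for illustration.

```python
def bits_per_block(bitrate_bps, frames_per_block, frame_rate_hz=100.0):
    """Bit budget for one block of jointly coded frames.

    frame_rate_hz=100 is an assumed feature frame rate, not a value
    stated in the abstract.
    """
    return int(round(bitrate_bps * frames_per_block / frame_rate_hz))


def allocate_bits(variances, total_bits):
    """Greedy integer bit allocation across transform coefficients.

    Each step gives one bit to the coefficient whose modelled distortion,
    variance * 2**(-2 * bits), is currently the largest. Only the stored
    variances (for example, KLT eigenvalues) are needed, so the budget
    can change without re-training anything.
    """
    bits = [0] * len(variances)
    distortion = list(variances)
    for _ in range(total_bits):
        i = max(range(len(distortion)), key=distortion.__getitem__)
        bits[i] += 1
        distortion[i] /= 4.0  # one more bit divides modelled distortion by 2**2
    return bits


# Hypothetical eigenvalues, chosen only to make the example readable.
eigenvalues = [16.0, 4.0, 1.0, 0.25]

for rate in (300, 800):  # the two bitrates quoted in the abstract
    budget = bits_per_block(rate, frames_per_block=5)  # block length assumed
    print(rate, "bps ->", budget, "bits per block ->",
          allocate_bits(eigenvalues, budget))
```

In a GMM-based scheme the budget would first be shared among mixture components and then among the coefficients of each component; the sketch covers only the second step for a single component.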
