Chapter

Automatic Speech Recognition on Mobile Devices and over Communication Networks

Part of the series Advances in Pattern Recognition pp 131-161

Quantization of Speech Features: Source Coding

  • Stephen So, Griffith School of Engineering, Signal Processing Laboratory, Griffith University
  • Kuldip K. Paliwal, Griffith School of Engineering, Signal Processing Laboratory, Griffith University


In this chapter, we describe various schemes for quantizing speech features for use in distributed speech recognition (DSR) systems. We analyze the statistical properties of Mel frequency-warped cepstral coefficients (MFCCs) that are most relevant to quantization, namely their correlation and the shape of their probability density function, to determine which type of quantization scheme would quantize them most efficiently. We also determine empirically the relationship between mean squared error and recognition accuracy, to verify that quantization schemes that minimize mean squared error also improve recognition performance. Furthermore, we highlight the importance of noise robustness in DSR and describe the use of a perceptually weighted distance measure in vector quantization to emphasize spectral peaks. Finally, we present experimental results for the quantization schemes in a DSR framework and compare their recognition performance.
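
As a rough illustration of the weighted distance measure mentioned above, the following minimal sketch shows a vector quantizer that picks the nearest codeword under a weighted squared Euclidean distance. The codebook, the feature vector, and the weight values are all hypothetical and chosen only for illustration; the actual perceptual weighting used in the chapter is not specified in this abstract.

```python
from typing import List, Sequence, Tuple


def weighted_sq_distance(x: Sequence[float], c: Sequence[float],
                         w: Sequence[float]) -> float:
    """Weighted squared Euclidean distance between vector x and codeword c."""
    return sum(wi * (xi - ci) ** 2 for xi, ci, wi in zip(x, c, w))


def quantize(x: Sequence[float], codebook: List[Sequence[float]],
             weights: Sequence[float]) -> Tuple[int, Sequence[float]]:
    """Return (index, codeword) of the codeword closest to x under the weights."""
    best = min(range(len(codebook)),
               key=lambda i: weighted_sq_distance(x, codebook[i], weights))
    return best, codebook[best]


# Hypothetical two-dimensional codebook and feature vector (illustration only).
codebook = [[0.0, 0.0], [1.0, 1.0], [2.0, 0.0]]
x = [0.9, 0.2]

# Uniform weights reduce to the ordinary squared-error criterion.
uniform_index, _ = quantize(x, codebook, [1.0, 1.0])

# Assumed non-uniform weights that emphasize the second component.
weighted_index, _ = quantize(x, codebook, [0.1, 1.0])

print(uniform_index, weighted_index)
```

With uniform weights the search is the usual minimum squared error rule; changing the weights changes which codeword is selected, which is how a weighted measure can favour the components judged most important.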