Real-Time Perceptual Coding of Wideband Speech by Competitive Neural Networks

  • Eros Pasero
  • Alfonso Montuori
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 2486)


We developed a real-time wideband speech codec based on a wavelet packet transform. The transform-domain coefficients were first quantized with a mid-tread uniform quantizer and then encoded with an arithmetic coder. In the first step, the quantization of the wavelet coefficients was driven by a psychoacoustic model. In the second step, the probability model of the quantized coefficients was adapted frame by frame by a competitive neural network. The network was trained on the TIMIT corpus, and its weights were updated in real time during compression to better model the speech characteristics of the current speaker. The coding/decoding algorithm was first written in C and then optimised for the TMS320C6000 DSP platform.
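The two steps named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the fixed `step` (which in the paper would come from the psychoacoustic model), the histogram-based probability vectors, and the learning `rate` are all assumptions made for illustration. The arithmetic coder itself is not shown; `ideal_bits` only estimates the code length an ideal entropy coder would spend under a given probability model.

```python
import math

def midtread_quantize(x, step):
    """Mid-tread uniform quantizer: zero is one of the reconstruction levels."""
    return int(math.floor(x / step + 0.5))

def midtread_dequantize(index, step):
    """Map a quantizer index back to its reconstruction level."""
    return index * step

def histogram(indices, max_index):
    """Normalised histogram of indices clipped to [-max_index, max_index]."""
    counts = [1.0] * (2 * max_index + 1)  # add-one smoothing avoids zero probabilities
    for i in indices:
        i = max(-max_index, min(max_index, i))
        counts[i + max_index] += 1.0
    total = sum(counts)
    return [c / total for c in counts]

def winner(codebook, target):
    """Index of the codebook vector closest to target (squared Euclidean distance)."""
    def dist(v):
        return sum((a - b) ** 2 for a, b in zip(v, target))
    return min(range(len(codebook)), key=lambda k: dist(codebook[k]))

def competitive_update(codebook, target, rate):
    """Winner-take-all step: only the winning vector moves toward target."""
    k = winner(codebook, target)
    codebook[k] = [w + rate * (t - w) for w, t in zip(codebook[k], target)]
    return k

def ideal_bits(indices, model, max_index):
    """Code length in bits that an ideal entropy coder would need under model."""
    bits = 0.0
    for i in indices:
        i = max(-max_index, min(max_index, i))
        bits -= math.log2(model[i + max_index])
    return bits

# Example frame: quantize, pick the closest stored model, then adapt it (hypothetical values).
step, max_index = 0.5, 2
frame = [0.1, -0.2, 0.6, -1.3, 0.05, 0.0, 0.24, -0.26]
indices = [midtread_quantize(x, step) for x in frame]
codebook = [[0.2] * 5, [0.05, 0.15, 0.6, 0.15, 0.05]]  # two candidate probability models
chosen = competitive_update(codebook, histogram(indices, max_index), rate=0.1)
cost = ideal_bits(indices, codebook[chosen], max_index)
```

Because each update is a convex combination of two probability vectors, the adapted model still sums to one, so it can be handed to an entropy coder without renormalisation. A decoder can only track such updates if it reproduces them from information it also has, for example from already decoded frames; how the paper handles this is not stated in the abstract.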


  • Wideband speech
  • Competitive Neural Network





Copyright information

© Springer-Verlag Berlin Heidelberg 2002

Authors and Affiliations

  • Eros Pasero (1)
  • Alfonso Montuori (1)
  1. Dipartimento di Elettronica, Politecnico di Torino, Torino, Italy
