Recent Advances in Speech Coding

  • D. Wolf
  • H. Reininger
Part of the NATO ASI Series book series (volume 46)

Abstract

After a short summary of some basic properties of speech signals and of speech signal models, the effect of linear prediction and vector quantization on data compression in speech coding is outlined. Some well-known coding schemes are reviewed. The recently developed RELP-S schemes based on speech analysis by synthesis are discussed in more detail. In particular, a scheme using stochastic excitation sequences is expected to guarantee high speech quality at data rates far below 8 kb/s.

Keywords

Speech Signal, Code Scheme, Vector Quantization, Linear Prediction, Mean Opinion Score
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.

Copyright information

© Springer-Verlag Berlin Heidelberg 1988

Authors and Affiliations

  • D. Wolf
    • 1
  • H. Reininger
    • 1
  1. Institut für Angewandte Physik, Universität Frankfurt a. M., Frankfurt a. M., Germany
