On Rationally DSP Implementation of the MP-MLQ/ACELP Dual Rate Speech Encoder for Multimedia Communications

International Journal of Speech Technology

Abstract

The excellent communications-quality performance that code-excited linear prediction (CELP) coders achieve below 8 kbps gives this architecture a predominant role in medium-rate and low-rate speech coding, as evidenced by its adoption in several recent fixed-rate and variable-rate standards. Unfortunately, some of these CELP-based schemes are not completely described in the literature, and consequently they are difficult to understand and to implement efficiently. This paper presents an original study of the ITU-T G.723.1 codec. The G.723.1 encoder is designed to compress voice signals with a bandwidth of up to 4 kHz efficiently and to deliver an encoded data stream with a very low bit rate and good quality of the transmitted speech (typical applications being the encoding of the voice signal for video conferencing over the GSTN and for Voice over IP). We carry out a detailed, step-by-step analysis, describing the MP-MLQ/ACELP speech coder from the point of view of a classical CELP structure. This approach allows us to identify, using theoretical considerations, the initial internal structure of each processing block of the encoder scheme. These results are then used to break down the main loop of the encoding algorithm. Finally, using the previously identified internal structure, we derive the algorithm for the pitch predictor block, which is one of the most difficult parts of the ITU-T G.723.1 encoder. The accompanying comments, explanations and diagrams allow regular DSP programmers to implement and debug the corresponding software efficiently.
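To give a concrete sense of what a pitch predictor block does in a classical CELP structure, the following is a minimal sketch of a textbook single-tap, open-loop pitch estimator: it searches for the lag L and gain β that make β·s(n−L) the best least-squares prediction of s(n). It is not the algorithm derived in the paper, nor the predictor specified by the standard. The function name `estimate_pitch_lag`, the default lag range and the window length are illustrative assumptions.

```python
def estimate_pitch_lag(signal, min_lag=18, max_lag=142, window=120):
    """Single-tap open-loop pitch search (generic sketch, assumed parameters).

    For each candidate lag, the most recent `window` samples are compared
    with the samples `lag` positions earlier. The lag that maximizes
    corr^2 / energy minimizes the squared error of the single-tap
    prediction s(n) ~ gain * s(n - lag), where gain = corr / energy.
    Returns (lag, gain), or (None, 0.0) if no lag has positive correlation.
    """
    n = len(signal)
    if n < window + max_lag:
        raise ValueError("signal too short for the requested lag range")

    start = n - window
    best_lag, best_score, best_gain = None, 0.0, 0.0

    for lag in range(min_lag, max_lag + 1):
        corr = 0.0    # cross-correlation between target and delayed signal
        energy = 0.0  # energy of the delayed signal
        for i in range(start, n):
            delayed = signal[i - lag]
            corr += signal[i] * delayed
            energy += delayed * delayed

        # Only positively correlated candidates are considered.
        if corr <= 0.0 or energy == 0.0:
            continue

        score = corr * corr / energy
        # Strict comparison keeps the smallest lag among equal scores,
        # which avoids picking a multiple of the true period.
        if score > best_score:
            best_lag, best_score, best_gain = lag, score, corr / energy

    return best_lag, best_gain


if __name__ == "__main__":
    # Synthetic sawtooth with a period of 40 samples.
    test_signal = [float((i % 40) - 20) for i in range(400)]
    print(estimate_pitch_lag(test_signal))
```

Practical coders typically refine such an open-loop estimate with a closed-loop search and may use multi-tap predictors; those refinements are outside the scope of this sketch.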

Negrescu, C., Stanomir, D. & Burileanu, D. On Rationally DSP Implementation of the MP-MLQ/ACELP Dual Rate Speech Encoder for Multimedia Communications. International Journal of Speech Technology 5, 281–300 (2002). https://doi.org/10.1023/A:1020253109447

  • DOI: https://doi.org/10.1023/A:1020253109447
