Phonetic Segmentation for Low Rate Speech Coding

Part of the book series: The Springer International Series in Engineering and Computer Science ((SECS,volume 114))

Abstract

Efforts to bridge the gap between waveform coders and vocoders have led to a new class of hybrid speech coders. These coders perform analysis-by-synthesis encoding of an excitation signal and reconstruct speech from the coded excitation signal and a quantized time-varying filter model of speech production. Most notable among these coders are those that use vector quantization to code the excitation signal as a sequence of vectors. The coding technique is called Code Excited Linear Prediction (CELP) [1] or Vector Excitation Coding (VXC) [2]. VXC coders produce coded speech whose waveform approximates the original, and they achieve satisfactory, natural-sounding quality at bit rates as low as 4.8 kb/s. When the bit rate is reduced below 4.8 kb/s, the quality of VXC coders degrades rapidly and becomes inferior to the synthetic quality of an LPC vocoder operating at 2.4 kb/s. There remains, then, the challenging problem of finding an algorithm that at 2.4 kb/s (or even at 3.6 kb/s) achieves the quality that VXC offers at 4.8 kb/s.
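The analysis-by-synthesis idea described above can be illustrated with a minimal sketch. It is not the coder from this chapter: the function names, codebook size, vector dimension, and filter coefficients are all hypothetical, and the perceptual weighting and long-term (pitch) prediction used in practical CELP/VXC coders are left out. Each candidate excitation vector is passed through an all-pole synthesis filter, scaled by its optimal gain, and compared with the target segment; the index with the smallest squared error is selected.

```python
import random


def synthesize(excitation, lpc):
    """All-pole synthesis with zero initial state:
    s[n] = e[n] + sum_k lpc[k] * s[n-1-k]."""
    out = []
    for n, e in enumerate(excitation):
        acc = e
        for k, a in enumerate(lpc):
            if n - 1 - k >= 0:
                acc += a * out[n - 1 - k]
        out.append(acc)
    return out


def search_codebook(target, codebook, lpc):
    """Exhaustive analysis-by-synthesis search.

    Returns (index, gain, error) for the codevector whose synthesized,
    gain-scaled output is closest to `target` in squared error.
    """
    best = None
    for i, code in enumerate(codebook):
        y = synthesize(code, lpc)
        energy = sum(v * v for v in y)
        if energy == 0.0:
            continue  # an all-zero output cannot be scaled to match anything
        corr = sum(t * v for t, v in zip(target, y))
        gain = corr / energy  # least-squares optimal gain for this codevector
        err = sum((t - gain * v) ** 2 for t, v in zip(target, y))
        if best is None or err < best[2]:
            best = (i, gain, err)
    return best


# Hypothetical setup: 64 random Gaussian codevectors of dimension 40 and a
# stable second-order predictor (poles at 0.5 and 0.4). All values are assumed.
rng = random.Random(0)
DIM, SIZE = 40, 64
codebook = [[rng.gauss(0.0, 1.0) for _ in range(DIM)] for _ in range(SIZE)]
lpc = [0.9, -0.2]

# Build a target that is exactly codevector 17 scaled by 2.5 and filtered.
target = synthesize([2.5 * v for v in codebook[17]], lpc)
index, gain, err = search_codebook(target, codebook, lpc)
```

Because the target here is constructed from codevector 17 with a gain of 2.5, the search returns that index with a gain close to 2.5 and a near-zero error. Only the index and a quantized gain would need to be transmitted per vector, which is how this family of coders keeps the excitation bit rate low.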

References

  1. M. R. Schroeder and B. S. Atal, “Code-Excited Linear Prediction (CELP): High-Quality Speech at Very Low Bit Rates,” Proceedings of IEEE International Conference on Acoustics, Speech, and Signal Processing, pp. 937–940, Tampa, March 1985.

  2. G. Davidson and A. Gersho, “Complexity Reduction Methods for Vector Excitation Coding,” Proceedings of IEEE International Conference on Acoustics, Speech, and Signal Processing, pp. 3055–3058, Tokyo, Japan, April 1986.

  3. P. Kroon and B. S. Atal, “Strategies for Improving the Performance of CELP Coders at Low Bit Rates,” Proceedings of IEEE International Conference on Acoustics, Speech, and Signal Processing, vol. 1, pp. 151–154, New York City, April 1988.

  4. Mei Yong and Allen Gersho, “Vector Excitation Coding with Dynamic Bit Allocation,” Proceedings of IEEE International Conference on Communication, vol. 1, pp. 290–294, Florida, November 1988.

  5. N. S. Jayant and J. H. Chen, “Speech Coding with Time-Varying Bit Allocation to Excitation and LPC Parameters,” Proc. IEEE Conf. Acoust., Speech, Sign. Processing, vol. 1, pp. 65–68, May 1989.

  6. T. Taniguchi, S. Unagami, and R. Gray, “Multimode Coding: Application to CELP,” Proc. IEEE Conf. Acoust., Speech, Sign. Processing, vol. 1, pp. 156–159, May 1989.

  7. S. Roucos, R. M. Schwartz, and J. Makhoul, “A Segment Vocoder at 150 b/s,” Proceedings of IEEE International Conference on Acoustics, Speech, and Signal Processing, pp. 61–64, Boston, April 1983.

  8. Maurizio Copperi, “Rule-Based Speech Analysis and Application to CELP Coding,” Proceedings of IEEE International Conference on Acoustics, Speech, and Signal Processing, vol. 1, pp. 143–146, New York City, April 1988.

  9. Shigeru Ono and Kazunori Ozawa, “2.4 Kbps Pitch Prediction Multi-pulse Speech Coding,” Proceedings of IEEE International Conference on Acoustics, Speech, and Signal Processing, vol. 1, pp. 175–178, New York City, April 1988.

  10. Shihua Wang and Allen Gersho, “Phonetically-Based Vector Excitation Coding of Speech at 3.6 kbit/s,” Proceedings of IEEE International Conference on Acoustics, Speech, and Signal Processing, Glasgow, May 1989.

  11. Osamu Fujimura and Kazunori Ozawa, “High-quality Speech Coding Using Multiple Types of Excitation Signals at 4.8 kb/s and Below,” Advances in Speech Coding, Kluwer Academic Publishers, 1990.

  12. T. Liu and H. Hoege, “Phonetically-Based LPC Vector Quantization of High Quality Speech,” Eurospeech 89, section 39.4, Paris, September 1989.

  13. G. Davidson, M. Yong, and A. Gersho, “Real-Time Vector Excitation Coding of Speech At 4800 bps,” Proceedings of IEEE International Conference on Acoustics, Speech, and Signal Processing, vol. 4, pp. 2189–2192, Dallas, April 1987.

  14. M. Yong, G. Davidson, and A. Gersho, “Encoding of LPC Spectral Parameters Using Switched-Adaptive Interframe Vector Prediction,” Proceedings of IEEE International Conference on Acoustics, Speech, and Signal Processing, vol. 1, pp. 402–405, New York City, April 1988.

  15. S. Singhal and B. S. Atal, “Improving Performance of Multi-Pulse LPC coders at Low Bit Rates,” Proceedings of IEEE International Conference on Acoustics, Speech, and Signal Processing, pp. 1.3.1–1.3.4, San Diego, 1984.

  16. Mei Yong and Allen Gersho, “Efficient Encoding of the Long-term Predictor in Vector Excitation Coders,” Advances in Speech Coding, Kluwer Academic Publishers, 1990.

Copyright information

© 1991 Springer Science+Business Media New York

About this chapter

Cite this chapter

Wang, S., Gersho, A. (1991). Phonetic Segmentation for Low Rate Speech Coding. In: Atal, B.S., Cuperman, V., Gersho, A. (eds) Advances in Speech Coding. The Springer International Series in Engineering and Computer Science, vol 114. Springer, Boston, MA. https://doi.org/10.1007/978-1-4615-3266-8_22

  • DOI: https://doi.org/10.1007/978-1-4615-3266-8_22

  • Publisher Name: Springer, Boston, MA

  • Print ISBN: 978-1-4613-6437-5

  • Online ISBN: 978-1-4615-3266-8

  • eBook Packages: Springer Book Archive
