Abstract
Efforts to bridge the gap between waveform coders and vocoders have led to a new class of hybrid speech coders. These coders perform analysis-by-synthesis encoding of an excitation signal and reconstruct speech from the coded excitation signal and a quantized time-varying filter model of speech production. Most notable among them are the coders that use vector quantization to code the excitation signal as a sequence of vectors. The coding technique is called Code Excited Linear Prediction (CELP) [1] or Vector Excitation Coding (VXC) [2]. VXC coders produce coded speech whose waveform approximates the original, and they achieve a satisfactory, natural-sounding quality at bit rates as low as 4.8 kb/s. When the bit rate is reduced below 4.8 kb/s, the quality of VXC coders degrades rapidly and becomes inferior to the synthetic quality of an LPC vocoder operating at 2.4 kb/s. There remains, then, the challenging problem of finding an algorithm that at 2.4 kb/s (or even at 3.6 kb/s) will achieve the quality that VXC offers at 4.8 kb/s.
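The analysis-by-synthesis idea described above can be illustrated with a minimal sketch. It is not the coder of this chapter or of [1] or [2]: the first-order filter, the random Gaussian codebook, the vector dimension, and the function names `synthesize` and `search_codebook` are all assumptions chosen for illustration, and perceptual weighting, filter memory, and the long-term predictor are omitted. Each candidate excitation vector is passed through the synthesis filter, and the encoder keeps the index and gain that minimize the squared error against the target segment.

```python
import random


def synthesize(excitation, lpc):
    """All-pole synthesis filter with zero initial state.

    Implements s[n] = e[n] - sum_k lpc[k] * s[n-1-k], i.e. 1/A(z) with
    A(z) = 1 + sum_k lpc[k] z^-(k+1). This sign convention is an assumption.
    """
    out = []
    for n, e in enumerate(excitation):
        acc = e
        for k, a in enumerate(lpc):
            if n - 1 - k >= 0:
                acc -= a * out[n - 1 - k]
        out.append(acc)
    return out


def search_codebook(target, codebook, lpc):
    """Exhaustive analysis-by-synthesis search.

    Returns (index, gain, error) for the codevector whose synthesized,
    optimally scaled output has the smallest squared error to the target.
    """
    best = (None, 0.0, float("inf"))
    for i, code in enumerate(codebook):
        y = synthesize(code, lpc)
        energy = sum(v * v for v in y)
        if energy == 0.0:
            continue  # an all-zero candidate cannot be scaled to fit
        corr = sum(t * v for t, v in zip(target, y))
        gain = corr / energy  # least-squares optimal gain for this candidate
        err = sum((t - gain * v) ** 2 for t, v in zip(target, y))
        if err < best[2]:
            best = (i, gain, err)
    return best


# Hypothetical setup: 64 random codevectors of dimension 20 and a
# first-order filter s[n] = e[n] + 0.9 * s[n-1].
rng = random.Random(0)
dim = 20
lpc = [-0.9]
codebook = [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(64)]

# Build a target from a known codevector so the search result can be checked.
target = [2.5 * v for v in synthesize(codebook[17], lpc)]
index, gain, err = search_codebook(target, codebook, lpc)
```

Because the target here is built from codevector 17 scaled by 2.5, the search should recover that index and gain. In this sketch only the index and a quantized gain would need to be transmitted per vector, which is what allows such coders to operate at low bit rates.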
References
[1] M. R. Schroeder and B. S. Atal, “Code-Excited Linear Prediction (CELP): High-Quality Speech at Very Low Bit Rates,” Proceedings of IEEE International Conference on Acoustics, Speech, and Signal Processing, pp. 937–940, Tampa, March 1985.
[2] G. Davidson and A. Gersho, “Complexity Reduction Methods for Vector Excitation Coding,” Proceedings of IEEE International Conference on Acoustics, Speech, and Signal Processing, pp. 3055–3058, Tokyo, Japan, April 1986.
[3] P. Kroon and B. S. Atal, “Strategies for Improving the Performance of CELP Coders at Low Bit Rates,” Proceedings of IEEE International Conference on Acoustics, Speech, and Signal Processing, vol. 1, pp. 151–154, New York City, April 1988.
[4] Mei Yong and Allen Gersho, “Vector Excitation Coding with Dynamic Bit Allocation,” Proceedings of IEEE International Conference on Communication, vol. 1, pp. 290–294, Florida, November 1988.
[5] N. S. Jayant and J. H. Chen, “Speech Coding with Time-Varying Bit Allocation to Excitation and LPC Parameters,” Proceedings of IEEE International Conference on Acoustics, Speech, and Signal Processing, vol. 1, pp. 65–68, May 1989.
[6] T. Taniguchi, S. Unagami, and R. Gray, “Multimode Coding: Application to CELP,” Proceedings of IEEE International Conference on Acoustics, Speech, and Signal Processing, vol. 1, pp. 156–159, May 1989.
[7] S. Roucos, R. M. Schwartz, and J. Makhoul, “A Segment Vocoder at 150 b/s,” Proceedings of IEEE International Conference on Acoustics, Speech, and Signal Processing, pp. 61–64, Boston, April 1983.
[8] Maurizio Copperi, “Rule-Based Speech Analysis and Application to CELP Coding,” Proceedings of IEEE International Conference on Acoustics, Speech, and Signal Processing, vol. 1, pp. 143–146, New York City, April 1988.
[9] Shigeru Ono and Kazunori Ozawa, “2.4 Kbps Pitch Prediction Multi-pulse Speech Coding,” Proceedings of IEEE International Conference on Acoustics, Speech, and Signal Processing, vol. 1, pp. 175–178, New York City, April 1988.
[10] Shihua Wang and Allen Gersho, “Phonetically-Based Vector Excitation Coding of Speech at 3.6 kbit/s,” Proceedings of IEEE International Conference on Acoustics, Speech, and Signal Processing, Glasgow, May 1989.
[11] Osamu Fujimura and Kazunori Ozawa, “High-quality Speech Coding Using Multiple Types of Excitation Signals at 4.8 kb/s and Below,” Advances in Speech Coding, Kluwer Academic Publishers, 1990.
[12] T. Liu and H. Hoege, “Phonetically-based LPC vector quantization of high quality speech,” Eurospeech 89, section 39.4, Paris, September 1989.
[13] G. Davidson, M. Yong, and A. Gersho, “Real-Time Vector Excitation Coding of Speech At 4800 bps,” Proceedings of IEEE International Conference on Acoustics, Speech, and Signal Processing, vol. 4, pp. 2189–2192, Dallas, April 1987.
[14] M. Yong, G. Davidson, and A. Gersho, “Encoding of LPC Spectral Parameters Using Switched-Adaptive Interframe Vector Prediction,” Proceedings of IEEE International Conference on Acoustics, Speech, and Signal Processing, vol. 1, pp. 402–405, New York City, April 1988.
[15] S. Singhal and B. S. Atal, “Improving Performance of Multi-Pulse LPC coders at Low Bit Rates,” Proceedings of IEEE International Conference on Acoustics, Speech, and Signal Processing, pp. 1.3.1–1.3.4, San Diego, 1984.
[16] Mei Yong and Allen Gersho, “Efficient Encoding of the Long-term Predictor in Vector Excitation Coders,” Advances in Speech Coding, Kluwer Academic Publishers, 1990.
Copyright information
© 1991 Springer Science+Business Media New York
About this chapter
Cite this chapter
Wang, S., Gersho, A. (1991). Phonetic Segmentation for Low Rate Speech Coding. In: Atal, B.S., Cuperman, V., Gersho, A. (eds) Advances in Speech Coding. The Springer International Series in Engineering and Computer Science, vol 114. Springer, Boston, MA. https://doi.org/10.1007/978-1-4615-3266-8_22
DOI: https://doi.org/10.1007/978-1-4615-3266-8_22
Publisher Name: Springer, Boston, MA
Print ISBN: 978-1-4613-6437-5
Online ISBN: 978-1-4615-3266-8
eBook Packages: Springer Book Archive