Abstract
We describe five types of speech coding systems with transmission bit rates spanning the range from 16,000 bits/sec (b/s) down to 100 b/s: adaptive predictive coders at 16 kb/s, baseband coders at 9.6 kb/s, linear predictive coders at 1.5–2.4 kb/s, clustering vocoders at 600–800 b/s, and diphone-based phonetic vocoders at 100 b/s. For each type of coder, we describe the coder configuration, discuss the important coding issues, and present the results available to date.
Copyright information
© 1982 D. Reidel Publishing Company, Dordrecht, Holland
About this paper
Cite this paper
Viswanathan, V.R., Makhoul, J., Schwartz, R. (1982). Medium and Low Bit Rate Speech Transmission. In: Haton, JP. (eds) Automatic Speech Analysis and Recognition. NATO Advanced Study Institutes Series, vol 88. Springer, Dordrecht. https://doi.org/10.1007/978-94-009-7879-9_2
DOI: https://doi.org/10.1007/978-94-009-7879-9_2
Publisher Name: Springer, Dordrecht
Print ISBN: 978-94-009-7881-2
Online ISBN: 978-94-009-7879-9