Waveform Interpolation

  • Jesper Haagen
  • W. Bastiaan Kleijn
Part of The Springer International Series in Engineering and Computer Science book series (SECS, volume 327)


Waveform interpolation (WI) has proved to be an efficient procedure for high-quality coding of speech at low bit rates. In this method, the speech signal is described by a sequence of characteristic waveforms, which are interpolated during reconstruction. Originally, the characteristic waveform was identified with a pitch cycle, and WI was applied to voiced speech segments only. A number of implementations that use code-excited linear prediction (CELP) for unvoiced signal segments showed that the procedure can provide high performance. Recently, the method was extended to include unvoiced speech and background noise. For this purpose, the characteristic waveform is decomposed into a slowly evolving waveform, which represents the periodic component of the signal, and a rapidly evolving waveform, which represents the other components of the signal. The rapidly evolving waveform requires high time resolution but only low quantization accuracy, while the slowly evolving waveform requires less time resolution but a more precise description. With this decomposition, switching between different coding models is avoided, and a robust coding method results.
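The decomposition and interpolation described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names `decompose` and `interpolate`, the moving-average smoother, and the linear interpolation rule are all assumptions chosen for illustration. The sketch treats the characteristic waveforms as a sequence of equal-length pitch-cycle shapes, takes the slowly evolving waveform as the average over neighbouring extraction instants, and takes the rapidly evolving waveform as the remainder.

```python
import math


def decompose(cws, span=5):
    """Split characteristic waveforms into slowly and rapidly evolving parts.

    cws:  list of waveforms, one per extraction instant; each is a list of
          samples over one normalized pitch cycle, all of equal length.
    span: odd length of a moving average applied along the time axis
          (an illustrative smoother, assumed here, not taken from the chapter).
    Returns (sew, rew) with cws[t][k] == sew[t][k] + rew[t][k].
    """
    n = len(cws)
    half = span // 2
    sew, rew = [], []
    for t in range(n):
        # Neighbouring waveforms, with the window truncated at the edges.
        window = cws[max(0, t - half):min(n, t + half + 1)]
        slow = [sum(w[k] for w in window) / len(window)
                for k in range(len(cws[t]))]
        sew.append(slow)
        rew.append([c - s for c, s in zip(cws[t], slow)])
    return sew, rew


def interpolate(w0, w1, alpha):
    """Linearly interpolate between two waveforms, with 0 <= alpha <= 1."""
    return [(1.0 - alpha) * a + alpha * b for a, b in zip(w0, w1)]


if __name__ == "__main__":
    # A fixed periodic shape plus a component that flips sign at every
    # extraction instant, standing in for a rapidly changing part.
    shape = [math.sin(2 * math.pi * k / 8) for k in range(8)]
    cws = [[s + 0.1 * (-1) ** t for s in shape] for t in range(10)]
    sew, rew = decompose(cws)
    midpoint = interpolate(sew[4], sew[5], 0.5)
    print(len(sew), len(rew), len(midpoint))
```

In this toy setting the smoothed part carries the stable pitch-cycle shape, which would be the part sampled sparsely and described precisely, while the remainder carries what changes from one instant to the next.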


Keywords: Speech Signal, Excitation Signal, Phase Spectrum, Speech Code, Magnitude Spectrum
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.





Copyright information

© Springer Science+Business Media New York 1995

Authors and Affiliations

  • Jesper Haagen (1)
  • W. Bastiaan Kleijn (1)
  1. AT&T Bell Laboratories, Information Principles Research Laboratory, Murray Hill, USA
