Homomorphic Systems and Cepstrum Analysis of Speech

Part of the book series: Springer Handbooks (SHB)

Abstract

In 1963, Bogert, Healy, and Tukey published a chapter with one of the most unusual titles to be found in the literature of science and engineering [9.1]. In this chapter, they observed that the logarithm of the power spectrum of a signal plus its echo (a delayed and scaled replica) consists of the logarithm of the signal spectrum plus a periodic component due to the echo. They suggested that further spectrum analysis of the log spectrum could highlight this periodic component and thus lead to a new indicator of the occurrence of an echo. Specifically, they made the following observation:

In general, we find ourselves operating on the frequency side in ways customary on the time side and vice versa.

As an aid in formalizing this new point of view, they introduced a number of paraphrased words. For example, they defined the cepstrum of a signal as the power spectrum of the logarithm of the power spectrum of a signal. (In fact, they used discrete-time spectrum estimates based on the discrete Fourier transform.) Similarly, the term quefrency was introduced for the independent variable of the cepstrum [9.1].
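The echo observation can be checked numerically. The following is a minimal sketch, not the authors' procedure: it assumes NumPy, a synthetic white-noise signal, and illustrative values for the echo delay (100 samples) and gain (0.5). The function names `power_cepstrum` and `estimate_echo_delay` are hypothetical. For x[n] = s[n] + a·s[n − d], the log power spectrum equals log|S|² plus log(1 + a² + 2a·cos(ωd)), which is periodic in frequency, so the cepstrum should show a peak at quefrency d.

```python
import numpy as np


def power_cepstrum(x, nfft):
    """Squared magnitude of the inverse DFT of the log power spectrum.

    The log power spectrum of a real signal is real and even, so its
    forward and inverse DFTs differ only by a constant scale factor.
    """
    spectrum = np.fft.fft(x, nfft)
    # Small offset (an assumption of this sketch) guards against log(0).
    log_power = np.log(np.abs(spectrum) ** 2 + 1e-12)
    return np.abs(np.fft.ifft(log_power)) ** 2


def estimate_echo_delay(x, nfft, min_quefrency=20):
    """Return the quefrency (in samples) of the largest cepstral peak.

    Low quefrencies are skipped because they carry the overall
    spectral level and shape rather than the echo ripple.
    """
    c = power_cepstrum(x, nfft)
    search = c[min_quefrency:nfft // 2]
    return min_quefrency + int(np.argmax(search))


# Synthetic test signal: white noise plus a delayed, scaled replica.
rng = np.random.default_rng(0)
n, delay, gain = 2048, 100, 0.5      # illustrative values only
s = rng.standard_normal(n)
x = np.zeros(n + delay)
x[:n] += s                           # direct signal
x[delay:] += gain * s                # echo: gain * s[n - delay]

estimated = estimate_echo_delay(x, nfft=4096)
print("estimated echo delay (samples):", estimated)
```

The search skips the first few quefrency bins by design; a smaller peak is also expected near twice the delay, because the logarithm of the echo term contains harmonics of the basic ripple.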

In this chapter we will explore why the cepstrum has emerged as a central concept in digital speech processing. We will start with definitions appropriate for discrete-time signal processing and develop some of the general properties and computational approaches for the cepstrum of speech. Using this basis, we will explore the many ways that the cepstrum has been used in speech processing applications.

Abbreviations

ASR: automatic speech recognition

CELP: code-excited linear prediction

DCT: discrete cosine transform

DFT: discrete Fourier transform

DTFT: discrete-time Fourier transform

DoD: Department of Defense

FFT: fast Fourier transform

FIR: finite impulse response

IDTFT: inverse discrete-time Fourier transform

LPC: linear prediction coefficients

LPC: linear predictive coding

MFCC: mel-filter cepstral coefficient

VQ: vector quantization

References

  1. B.P. Bogert, M.J.R. Healy, J.W. Tukey: The quefrency alanysis of time series for echoes: cepstrum, pseudo-autocovariance, cross-cepstrum, and saphe cracking, Proc. of the Symposium on Time Series Analysis, ed. by M. Rosenblatt (Wiley, New York 1963)

  2. R.W. Schafer: Echo removal by discrete generalized linear filtering (MIT, Cambridge 1968), Ph.D. dissertation

  3. A.V. Oppenheim, R.W. Schafer, T.G. Stockham Jr.: Nonlinear filtering of multiplied and convolved signals, Proc. IEEE 56(8), 1264-1291 (1968)

  4. A.V. Oppenheim, R.W. Schafer, J.R. Buck: Discrete-Time Signal Processing (Prentice-Hall, Upper Saddle River 1999)

  5. A.V. Oppenheim: Superposition in a Class of Nonlinear Systems (MIT, Cambridge 1964), Ph.D. dissertation. Also: MIT Research Lab. of Electronics, Cambridge, Massachusetts, Technical Report 432

  6. J.M. Tribolet: A new phase unwrapping algorithm, IEEE Trans. Acoust. Speech ASSP-25(2), 170-177 (1977)

  7. G.A. Sitton, C.S. Burrus, J.W. Fox, S. Treitel: Factoring very-high-degree polynomials, IEEE Signal Proc. Mag. 20(6), 27-42 (2003)

  8. L.R. Rabiner, R.W. Schafer: Digital Processing of Speech Signals (Prentice-Hall, Englewood Cliffs 1978)

  9. A.V. Oppenheim, R.W. Schafer: Homomorphic analysis of speech, IEEE Trans. Audio Electroacoust. AU-16, 221-228 (1968)

  10. G.E. Kopec, A.V. Oppenheim, J.M. Tribolet: Speech analysis by homomorphic prediction, IEEE Trans. Acoust. Speech ASSP-25(1), 40-49 (1977)

  11. A.M. Noll: Cepstrum pitch determination, J. Acoust. Soc. Am. 41(2), 293-309 (1967)

  12. B.S. Atal, S.L. Hanauer: Speech analysis and synthesis by linear prediction of the speech wave, J. Acoust. Soc. Am. 50, 561-580 (1971)

  13. A.V. Oppenheim: A speech analysis-synthesis system based on homomorphic filtering, J. Acoust. Soc. Am. 45(2), 293-309 (1969)

  14. R.W. Schafer, L.R. Rabiner: System for automatic formant analysis of voiced speech, J. Acoust. Soc. Am. 47(2), 458-465 (1970)

  15. B.S. Atal, J. Remde: A new model of LPC excitation for producing natural-sounding speech at low bit rates, Proc. IEEE ICASSP (1982), 614-617

  16. M.R. Schroeder, B.S. Atal: Code-excited linear prediction (CELP): high-quality speech at very low bit rates, Proc. IEEE ICASSP (1985), 937-940

  17. R.C. Rose, T.P. Barnwell III: The self excited vocoder - an alternate approach to toll quality at 4800 bps, Proc. IEEE ICASSP 11, 453-456 (1986)

  18. J.H. Chung, R.W. Schafer: Excitation modeling in a homomorphic vocoder, Proc. IEEE ICASSP 1, 25-28 (1990)

  19. J.H. Chung, R.W. Schafer: Performance evaluation of analysis-by-synthesis homomorphic vocoders, Proc. IEEE ICASSP 2, 117-120 (1992)

  20. B.S. Atal, M.R. Schroeder: Predictive coding of speech signals and subjective error criterion, IEEE Trans. Acoust. Speech ASSP-27, 247-254 (1979)

  21. T.G. Stockham Jr., T.M. Cannon, R.B. Ingebretsen: Blind deconvolution through digital signal processing, Proc. IEEE 63, 678-692 (1975)

  22. S. Furui: Cepstral analysis technique for automatic speaker verification, IEEE Trans. Acoust. Speech ASSP-29(2), 254-272 (1981)

  23. Y. Tohkura: A weighted cepstral distance measure for speech recognition, IEEE Trans. Acoust. Speech ASSP-35(10), 1414-1422 (1987)

  24. B.-H. Juang, L.R. Rabiner, J.G. Wilpon: On the use of bandpass liftering in speech recognition, IEEE Trans. Acoust. Speech ASSP-35(7), 947-954 (1987)

  25. F. Itakura, T. Umezaki: Distance measure for speech recognition based on the smoothed group delay spectrum, Proc. IEEE ICASSP 12, 1257-1260 (1987)

  26. S.B. Davis, P. Mermelstein: Comparison of parametric representations for monosyllabic word recognition in continuously spoken sentences, IEEE Trans. Acoust. Speech ASSP-28(4), 357-366 (1980)

  27. P.D. Smith, M. Kucic, R. Ellis, P. Hasler, D.V. Anderson: Mel-frequency cepstrum encoding in analog floating-gate circuitry, Proc. ISCAS 2002(4), 671-674 (2002)

Author information

Correspondence to Ronald W. Schafer.

Copyright information

© 2008 Springer-Verlag Berlin Heidelberg

About this chapter

Cite this chapter

Schafer, R.W. (2008). Homomorphic Systems and Cepstrum Analysis of Speech. In: Benesty, J., Sondhi, M.M., Huang, Y.A. (eds) Springer Handbook of Speech Processing. Springer Handbooks. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-49127-9_9

  • DOI: https://doi.org/10.1007/978-3-540-49127-9_9

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-49125-5

  • Online ISBN: 978-3-540-49127-9

  • eBook Packages: Engineering (R0)