Part of the book series: Springer Handbooks (SHB)

Abstract

Noise is inevitable. In all voice and speech applications, from sound recording, telecommunications, and telecollaboration to human-machine interfaces, the signal of interest picked up by a microphone is generally contaminated by noise. As a result, the microphone signal has to be cleaned up with digital signal-processing tools before it is stored, analyzed, transmitted, or played out. The cleaning process, often referred to as either noise reduction or speech enhancement, has attracted considerable research and engineering attention for several decades. Remarkable advances have already been made, and the area continues to progress, with the aim of creating processors that can extract the desired speech signal as if there were no noise. This chapter presents a methodical overview of the state of the art of noise-reduction algorithms. Based on their theoretical origin, the algorithms are categorized into three fundamental classes: filtering techniques, spectral restoration, and model-based methods. We outline the basic ideas underlying these approaches, discuss their characteristics, explain their intrinsic relationships, and review their advantages and disadvantages.

Abbreviations

AR: autoregressive
CE: categorical estimation
CRLB: Cramér-Rao lower bound
DFT: discrete Fourier transform
EM: estimate-maximize
FFT: fast Fourier transform
FIR: finite impulse response
HMM: hidden Markov model
HNM: harmonic-plus-noise model
IDFT: inverse DFT
LP: linear prediction
LPC: linear predictive coding
MAP: maximum a posteriori
ML: maximum likelihood
MMSE: minimum mean-square error
MOS: mean opinion score
MSE: mean-square error
PSD: power spectral density
SNR: signal-to-noise ratio
STFT: short-time Fourier transform

Correspondence to Dr. Jingdong Chen, Prof. Jacob Benesty, Dr. Yiteng (Arden) Huang, or Dr. Eric J. Diethorn.

Copyright information

© 2008 Springer-Verlag Berlin Heidelberg

About this chapter

Cite this chapter

Chen, J., Benesty, J., Huang, Y.A., Diethorn, E.J. (2008). Fundamentals of Noise Reduction. In: Benesty, J., Sondhi, M.M., Huang, Y.A. (eds) Springer Handbook of Speech Processing. Springer Handbooks. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-49127-9_43

  • DOI: https://doi.org/10.1007/978-3-540-49127-9_43

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-49125-5

  • Online ISBN: 978-3-540-49127-9

  • eBook Packages: Engineering, Engineering (R0)
