Fundamentals of Noise Reduction

Chen, Jingdong; Benesty, Jacob; Huang, Yiteng (Arden); Diethorn, Eric J.

doi:10.1007/978-3-540-49127-9_43

Part of the book series: Springer Handbooks (SHB)

Abstract

Noise is inevitable. In every application that involves voice and speech, from sound recording, telecommunications, and telecollaboration to human-machine interfaces, the signal of interest picked up by a microphone is generally contaminated by noise. The microphone signal therefore has to be cleaned up with digital signal-processing tools before it is stored, analyzed, transmitted, or played out. This cleaning process, often referred to as either noise reduction or speech enhancement, has attracted considerable research and engineering attention for several decades. Remarkable advances have been made, and the area continues to progress toward processors that can extract the desired speech signal as if there were no noise. This chapter presents a methodical overview of the state of the art in noise-reduction algorithms. Based on their theoretical origin, the algorithms are categorized into three fundamental classes: filtering techniques, spectral restoration, and model-based methods. We outline the basic ideas underlying these approaches, discuss their characteristics, explain their intrinsic relationships, and review their advantages and disadvantages.
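
As a rough illustration of the first class (filtering techniques), the sketch below computes a per-frequency Wiener-style gain from power spectral density (PSD) estimates. It is a minimal sketch under stated assumptions and is not taken from the chapter: the function name `wiener_gain`, the `floor` parameter, and the premise that noisy-signal and noise PSD estimates are already available (with speech and noise uncorrelated) are all illustrative choices.

```python
def wiener_gain(noisy_psd, noise_psd, floor=0.0):
    """Return a per-bin gain 1 - noise/noisy, clipped to [floor, 1].

    noisy_psd: PSD estimates of the microphone (noisy) signal, one per bin.
    noise_psd: PSD estimates of the noise alone, one per bin.
    floor:     hypothetical lower bound on the gain (assumption, not from the source).
    """
    if len(noisy_psd) != len(noise_psd):
        raise ValueError("PSD sequences must have the same length")

    gains = []
    for p_y, p_v in zip(noisy_psd, noise_psd):
        if p_y <= 0.0:
            # No observed energy in this bin: fall back to the floor.
            gains.append(floor)
            continue
        # With uncorrelated speech and noise, speech PSD ~ p_y - p_v,
        # so the Wiener gain is (p_y - p_v) / p_y = 1 - p_v / p_y.
        g = 1.0 - p_v / p_y
        gains.append(min(1.0, max(floor, g)))
    return gains


if __name__ == "__main__":
    # Bin 0 has an SNR well above 0 dB; bin 1 is dominated by noise.
    print(wiener_gain([4.0, 1.0], [1.0, 2.0]))
```

Multiplying each short-time spectral coefficient of the noisy signal by the corresponding gain attenuates bins dominated by noise while leaving high-SNR bins nearly untouched; a nonzero `floor` limits how far any bin is suppressed.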

Abbreviations

AR: autoregressive
CE: categorical estimation
CRLB: Cramér-Rao lower bound
DFT: discrete Fourier transform
EM: estimate-maximize
FFT: fast Fourier transform
FIR: finite impulse response
HMM: hidden Markov model
HNM: harmonic-plus-noise model
IDFT: inverse DFT
LP: linear prediction
LPC: linear predictive coding
MAP: maximum a posteriori
ML: maximum-likelihood
MMSE: minimum mean-square error
MOS: mean opinion score
MSE: mean-square error
PSD: power spectral density
SNR: signal-to-noise ratio
STFT: short-time Fourier transform

Author information

Authors and Affiliations

Alcatel-Lucent, Bell Laboratories, 600 Mountain Ave, 07974, Murray Hill, NJ, USA
Jingdong Chen Dr.
INRS-EMT, University of Quebec, 800 de la Gauchetiere Ouest, H5A 1K6, Montreal, Quebec, Canada
Jacob Benesty Prof.
Alcatel-Lucent, Bell Laboratories, 600 Mountain Avenue, 07974, Murray Hill, NJ, USA
Yiteng (Arden) Huang Dr.
Multimedia Technologies Research Department, Avaya Labs Research, 233 Mt. Airy Road, 07920, Basking Ridge, NJ, USA
Eric J. Diethorn Dr.

Corresponding authors

Correspondence to Jingdong Chen Dr., Jacob Benesty Prof., Yiteng (Arden) Huang Dr. or Eric J. Diethorn Dr.

Editor information

Editors and Affiliations

INRS-EMT, University of Quebec, 800 de la Gauchetiere Ouest, H5A 1K6, Montreal, Quebec, Canada
Jacob Benesty Dr.
Avaya Labs Research, 233 Mount Airy Road, 07920, Basking Ridge, NJ, USA
M. Mohan Sondhi Ph.D.
Alcatel-Lucent, Bell Laboratories, 600 Mountain Avenue, 07974, Murray Hill, NJ, USA
Yiteng Arden Huang Dr.

About this chapter

Cite this chapter

Chen, J., Benesty, J., Huang, Y.A., Diethorn, E.J. (2008). Fundamentals of Noise Reduction. In: Benesty, J., Sondhi, M.M., Huang, Y.A. (eds) Springer Handbook of Speech Processing. Springer Handbooks. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-49127-9_43

DOI: https://doi.org/10.1007/978-3-540-49127-9_43
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-49125-5
Online ISBN: 978-3-540-49127-9
eBook Packages: Engineering, Engineering (R0)
