Speech Enhancement: Application of the Kalman Filter in the Estimate-Maximize (EM) Framework

Gannot, Sharon

doi:10.1007/3-540-27489-8_8



Part of the book series: Signals and Communication Technology (SCT)

Abstract

This chapter presents the application of the Kalman filter to single-microphone speech enhancement. Among the numerous published algorithms, an important sub-group employs the estimate-maximize (EM) procedure to iteratively estimate the spectral parameters of the speech and noise signals. We elaborate on a specific member of this sub-group. In the E-step, the Kalman smoother is applied, and in the M-step, a non-standard Yule-Walker equation set is solved. An approximate EM algorithm is then derived by applying the gradient-descent method to the likelihood function, which yields a sequential, computationally efficient algorithm. It is then shown that the sequential parameter estimation can be replaced by a Kalman filter, giving a dual Kalman filter for the speech and the parameters. A natural generalization of the dual scheme is one in which speech and parameters are estimated jointly by applying a nonlinear extension of the Kalman filter, namely the unscented Kalman filter. An extensive experimental study using real speech and noise signals compares the proposed methods with alternative speech enhancement algorithms. Kalman-filter-based algorithms are shown to maintain natural speech quality; however, their noise reduction ability is limited.
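To make the filtering step concrete, the following is a minimal sketch of a scalar Kalman filter, not the chapter's algorithm. It assumes a first-order autoregressive (AR(1)) signal model observed in additive white noise, and it assumes the model parameters `a`, `q`, and `r` are known. The chapter instead estimates such parameters with EM, which is not attempted here. The function names and parameter values are hypothetical and chosen for illustration only.

```python
import random


def kalman_filter_ar1(y, a, q, r, x0=0.0, p0=1.0):
    """Scalar Kalman filter for x[t] = a*x[t-1] + w[t], y[t] = x[t] + v[t].

    q and r are the variances of the driving noise w and the additive noise v
    (assumed known here). Returns filtered estimates and error variances.
    """
    x_est, p_est = x0, p0
    estimates, variances = [], []
    for obs in y:
        # Prediction (time update) through the assumed AR(1) model.
        x_pred = a * x_est
        p_pred = a * a * p_est + q
        # Correction (measurement update) using the noisy sample.
        gain = p_pred / (p_pred + r)
        x_est = x_pred + gain * (obs - x_pred)
        p_est = (1.0 - gain) * p_pred
        estimates.append(x_est)
        variances.append(p_est)
    return estimates, variances


def simulate_ar1(n, a, q, r, seed=0):
    """Generate a synthetic AR(1) signal and its noisy observation."""
    rng = random.Random(seed)
    clean, noisy = [], []
    x = 0.0
    for _ in range(n):
        x = a * x + rng.gauss(0.0, q ** 0.5)
        clean.append(x)
        noisy.append(x + rng.gauss(0.0, r ** 0.5))
    return clean, noisy


def mse(u, v):
    """Mean squared error between two equal-length sequences."""
    return sum((ui - vi) ** 2 for ui, vi in zip(u, v)) / len(u)


# Synthetic stand-in for a noisy signal; real speech would need a
# higher-order AR model and frame-wise parameter estimates.
clean, noisy = simulate_ar1(n=2000, a=0.95, q=0.1, r=1.0, seed=0)
enhanced, variances = kalman_filter_ar1(noisy, a=0.95, q=0.1, r=1.0)
print("MSE of noisy input:   ", mse(clean, noisy))
print("MSE of filtered output:", mse(clean, enhanced))
```

The sketch uses a scalar state to keep the recursion readable. A speech model of order p would replace the scalars with a companion-form state vector and matrix, and a smoother would add a backward pass over the filtered estimates.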



Author information

Authors and Affiliations

Bar-Ilan University, Ramat-Gan, 52900, Israel
Sharon Gannot


About this chapter

Cite this chapter

Gannot, S. (2005). Speech Enhancement: Application of the Kalman Filter in the Estimate-Maximize (EM) Framework. In: Speech Enhancement. Signals and Communication Technology. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-27489-8_8


DOI: https://doi.org/10.1007/3-540-27489-8_8
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-24039-6
Online ISBN: 978-3-540-27489-6
eBook Packages: Engineering, Engineering (R0)
