Weighted Sigmoid-Based Frequency-Selective Noise Filtering for Speech Denoising

Circuits, Systems, and Signal Processing

Abstract

Noise estimation often has a major impact on the quality of the enhanced signal, especially in speech enhancement applications. Because the statistics of non-stationary noise vary with time, deciding whether a frame is speech-active or speech-inactive is difficult. Further, since no prior information about the noise distribution is available, estimators use recursive averaging with a fixed smoothing coefficient ranging from 0.70 to 0.99. This fixed smoothing coefficient ties the current estimate to the noise statistics of previous frames. Unfortunately, with a fixed smoothing coefficient the estimator treats speech-active and speech-inactive frames equally, which may cause leakage of speech or noise power and result in a loss of speech intelligibility. To address this problem and to increase noise estimation accuracy, this paper proposes an adaptive smoothing coefficient that depends on the a posteriori SNR and on frequency. Further, this paper investigates the performance of the proposed weighted sigmoid function (WSIG) noise estimator. Both objective and subjective quality assessments indicate that the proposed noise estimator tracks noise spectral variations considerably better than existing state-of-the-art methods.
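
The contrast drawn in the abstract, between recursive averaging with a fixed smoothing coefficient and a coefficient that adapts to the a posteriori SNR and to frequency, can be illustrated with a minimal sketch. The sigmoid mapping below, the function names, and the parameters `slope`, `midpoint`, `alpha_min`, and `eps` are assumptions chosen for illustration only; this is not the WSIG estimator defined in the paper, whose exact form is not given in this preview.

```python
import numpy as np


def fixed_recursive_average(noisy_psd, alpha=0.9):
    """Noise PSD estimate by recursive averaging with a fixed coefficient.

    noisy_psd: array of shape (frames, bins) holding the noisy-speech
    periodogram. Every frame is averaged in with the same weight, whether
    or not speech is present.
    """
    noisy_psd = np.asarray(noisy_psd, dtype=float)
    noise = np.empty_like(noisy_psd)
    noise[0] = noisy_psd[0]  # initialise from the first frame
    for t in range(1, noisy_psd.shape[0]):
        noise[t] = alpha * noise[t - 1] + (1.0 - alpha) * noisy_psd[t]
    return noise


def sigmoid_adaptive_average(noisy_psd, slope=2.0, midpoint=2.0,
                             alpha_min=0.7, eps=1e-12):
    """Illustrative adaptive variant (hypothetical, not the paper's WSIG).

    The smoothing coefficient is a sigmoid of the a posteriori SNR, so it
    approaches 1 (update frozen) when speech is likely and alpha_min when
    the frame looks like noise. slope and midpoint may be scalars or
    per-bin arrays, which makes the coefficient frequency dependent.
    """
    noisy_psd = np.asarray(noisy_psd, dtype=float)
    noise = np.empty_like(noisy_psd)
    noise[0] = noisy_psd[0]
    for t in range(1, noisy_psd.shape[0]):
        # A posteriori SNR: noisy power over the previous noise estimate.
        post_snr = noisy_psd[t] / np.maximum(noise[t - 1], eps)
        s = 1.0 / (1.0 + np.exp(-slope * (post_snr - midpoint)))
        alpha = 1.0 - (1.0 - alpha_min) * (1.0 - s)
        noise[t] = alpha * noise[t - 1] + (1.0 - alpha) * noisy_psd[t]
    return noise


# Toy input: unit-power noise with a strong "speech" burst in frames 20-29.
noisy_psd = np.ones((50, 4))
noisy_psd[20:30] += 100.0

fixed = fixed_recursive_average(noisy_psd, alpha=0.9)
adaptive = sigmoid_adaptive_average(noisy_psd,
                                    midpoint=np.linspace(2.0, 3.0, 4))
print(fixed[29, 0], adaptive[29, 0])
```

On this toy input the fixed-coefficient estimate absorbs much of the burst power, while the adaptive estimate stays close to the true noise level. A sketch this simple would also freeze when the noise floor itself jumps abruptly, which is one reason practical estimators use more careful designs.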

Acknowledgements

The authors would like to thank Sh. Naresh Kumar, Director, Snow & Avalanche Study Establishment (SASE), Defence Research and Development Organization (DRDO), Government of India, for perusing this research. The authors would also like to thank the human participants who took part in the listening test for subjective quality assessment.

Funding

The co-author of this paper, Dr. S.Y. Low, is supported by the Fundamental Research Grants Scheme (FRGS/1/2019/TK04/USMC/02/1) under the Ministry of Higher Education, Malaysia.

Author information

Corresponding author

Correspondence to Neeraj Sharma.

About this article

Cite this article

Sharma, N., Singh, M.K., Low, S.Y. et al. Weighted Sigmoid-Based Frequency-Selective Noise Filtering for Speech Denoising. Circuits Syst Signal Process 40, 276–295 (2021). https://doi.org/10.1007/s00034-020-01469-9

  • DOI: https://doi.org/10.1007/s00034-020-01469-9
