Noisy speech enhancement based on correlation canceling/log-MMSE hybrid method

Multimedia Tools and Applications

Abstract

This paper presents a speech enhancement method that combines the correlation canceling (CC) approach with the log minimum mean-square-error (log-MMSE) estimator. Conventional statistical-model methods, such as the Maximum-Likelihood (ML), Maximum A Posteriori (MAP), Minimum Mean Square Error (MMSE) and log-MMSE estimators, rely on a nonlinear estimation of the enhanced speech signal. In the proposed hybrid method (CC/Log-MMSE), the nonlinear estimation is instead transformed into a linear estimation by exploiting the orthogonal projection of the clean signal onto the noisy signal. The enhanced signal thus represents the best “copy,” or estimate, of the clean signal that can be made on the basis of the noisy signal vector. This can also be seen as canceling the component of the noisy vector that resides in the noise subspace, which in turn improves the intelligibility of the enhanced signal. Extensive simulations, carried out on speech files corrupted by different noise types from the NOIZEUS corpus, show that the proposed CC/Log-MMSE method consistently outperforms the baseline speech enhancement methods at different SNR levels in terms of objective and subjective measures, spectrogram analysis and overall SNR improvement.
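The linear-estimation idea in the abstract can be illustrated with a minimal sketch of a correlation canceler. It is not the authors' CC/Log-MMSE implementation, whose details are not given here. It assumes zero-mean scalar signals, additive independent Gaussian noise, and oracle access to the clean signal when estimating the correlations; a real enhancer would have to estimate them from the noisy input alone. The function name `correlation_canceler_gain` and all variable names are hypothetical. In the vector case the scalar ratio below would become the matrix product of the cross-correlation matrix and the inverse autocorrelation matrix of the noisy vector.

```python
import random


def correlation_canceler_gain(clean, noisy):
    """Sample estimate of H = E[x*y] / E[y*y] for zero-mean scalar signals."""
    r_xy = sum(x * y for x, y in zip(clean, noisy))  # cross-correlation (unnormalised)
    r_yy = sum(y * y for y in noisy)                 # autocorrelation (unnormalised)
    return r_xy / r_yy


def mse(a, b):
    """Mean squared error between two equal-length sequences."""
    return sum((u - v) ** 2 for u, v in zip(a, b)) / len(a)


random.seed(0)  # fixed seed so the sketch is repeatable
n_samples = 5000

# Assumption: Gaussian stand-ins for clean speech and noise, not real NOIZEUS data.
clean = [random.gauss(0.0, 1.0) for _ in range(n_samples)]
noise = [random.gauss(0.0, 0.5) for _ in range(n_samples)]
noisy = [x + n for x, n in zip(clean, noise)]

# Linear estimate: project the clean signal onto the noisy observation.
H = correlation_canceler_gain(clean, noisy)
estimate = [H * y for y in noisy]

# Orthogonality principle: the estimation error is uncorrelated with the
# observation, i.e. the part of x correlated with y has been "canceled".
error = [x - x_hat for x, x_hat in zip(clean, estimate)]
residual_corr = sum(e * y for e, y in zip(error, noisy)) / n_samples

mse_noisy = mse(clean, noisy)
mse_estimate = mse(clean, estimate)
print(f"gain H = {H:.3f}")
print(f"residual correlation = {residual_corr:.2e}")
print(f"MSE noisy = {mse_noisy:.3f}, MSE estimate = {mse_estimate:.3f}")
```

Because `H` minimises the sample mean squared error over all scalar gains, the estimate cannot be worse than the unprocessed noisy signal, and the residual correlation is zero up to floating-point rounding.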

Data availability

The noisy speech dataset (NOIZEUS) and the documentation related to this work can be downloaded from https://ecs.utdallas.edu/loizou/speech/noizeus/

Author information

Corresponding author

Correspondence to Nassim Asbai.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Asbai, N., Zitouni, S., Bounazou, H. et al. Noisy speech enhancement based on correlation canceling/log-MMSE hybrid method. Multimed Tools Appl 82, 5803–5821 (2023). https://doi.org/10.1007/s11042-022-13591-8

  • DOI: https://doi.org/10.1007/s11042-022-13591-8
