International Journal of Speech Technology, Volume 21, Issue 4, pp 809–823

A new efficient backward BSS crosstalk-resistant algorithm for automatic blind speech quality enhancement

  • Mohamed Djendi
  • Meriem Zoulikha
Article

Abstract

Over the last 10 years, several noise reduction (NR) algorithms have been proposed for use with blind source separation techniques to separate speech and noise signals from blind noisy observations. Most often, these techniques rely on voice activity detector (VAD) systems to reach the optimal solution. In this paper, we propose a new backward blind source separation (BBSS) structure that uses the input correlation properties to provide: (i) high convergence rates and good tracking capabilities, since acoustic environments imply long and time-variant noise paths, and (ii) low misalignment and robustness against variations in noise type and against double-talk. The proposed algorithm enhances noisy speech signals automatically and does not need any VAD system to separate speech and noise signals. Results obtained on several objective criteria show that the proposed algorithm performs well in comparison with state-of-the-art algorithms.
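The abstract gives no equations, so the following is only a minimal sketch of a generic two-channel backward (feedback) separation structure with a plain LMS decorrelation update. It is not the algorithm proposed in the paper. The mixing model, the one-sample delay on the cross paths, the step size `mu`, and the names `backward_bss`, `mix`, `h21`, `h12`, `w21`, and `w12` are all assumptions made for illustration.

```python
import random


def backward_bss(p1, p2, num_taps, mu=0.0, w21=None, w12=None):
    """Generic two-channel backward (feedback) separation, optional LMS update.

    Assumption: the cross filters act only on past outputs (at least one
    sample of delay), which avoids a delay-free feedback loop.
    """
    w21 = list(w21) if w21 is not None else [0.0] * num_taps
    w12 = list(w12) if w12 is not None else [0.0] * num_taps
    u1_hist = [0.0] * num_taps  # u1(n-1), ..., u1(n-num_taps)
    u2_hist = [0.0] * num_taps  # u2(n-1), ..., u2(n-num_taps)
    u1_out, u2_out = [], []
    for x1, x2 in zip(p1, p2):
        # Each output subtracts a filtered copy of the other output's past.
        u1 = x1 - sum(w * u for w, u in zip(w21, u2_hist))
        u2 = x2 - sum(w * u for w, u in zip(w12, u1_hist))
        if mu > 0.0:
            # Decorrelation-style LMS step: drive E[u1(n) u2(n-k)] toward zero.
            w21 = [w + mu * u1 * u for w, u in zip(w21, u2_hist)]
            w12 = [w + mu * u2 * u for w, u in zip(w12, u1_hist)]
        u1_hist = [u1] + u1_hist[:-1]
        u2_hist = [u2] + u2_hist[:-1]
        u1_out.append(u1)
        u2_out.append(u2)
    return u1_out, u2_out, w21, w12


def mix(s, b, h21, h12):
    """Hypothetical two-microphone mixture with delayed cross-coupling paths."""
    p1, p2 = [], []
    for n in range(len(s)):
        leak_b = sum(h * b[n - 1 - k] for k, h in enumerate(h21) if n - 1 - k >= 0)
        leak_s = sum(h * s[n - 1 - k] for k, h in enumerate(h12) if n - 1 - k >= 0)
        p1.append(s[n] + leak_b)  # mostly "speech" plus leaked noise
        p2.append(b[n] + leak_s)  # mostly noise plus leaked "speech"
    return p1, p2


if __name__ == "__main__":
    random.seed(0)
    # White Gaussian stand-ins for the speech and noise sources.
    s = [random.gauss(0.0, 1.0) for _ in range(6000)]
    b = [random.gauss(0.0, 1.0) for _ in range(6000)]
    h21, h12 = [0.4, 0.2, -0.1, 0.05], [0.3, -0.2, 0.1, 0.05]
    p1, p2 = mix(s, b, h21, h12)
    _, _, w21, w12 = backward_bss(p1, p2, num_taps=4, mu=0.002)
    print("estimated w21:", [round(w, 2) for w in w21])
```

In this toy setting, setting the cross filters equal to the true cross-coupling paths recovers both sources exactly, which is the property the backward structure exploits. Note that both updates run at every sample without any voice activity decision, which loosely mirrors the VAD-free behavior described above. Real speech is neither white nor stationary, so practical algorithms need more careful step-size control than this fixed `mu`.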

Keywords

Speech enhancement, Noise reduction, Voice activity detector, BSS, Forward, Backward

Copyright information

© Springer Science+Business Media, LLC, part of Springer Nature 2018

Authors and Affiliations

  1. Signal Processing and Image Laboratory (LATSI), University of Blida 1, Blida, Algeria
