Model selection toward robustness speaker verification in reverberant conditions

Multimedia Tools and Applications

Abstract

Speech signals recorded in the far field or with a distant microphone typically contain additive noise and reverberation. These degrade the reliability and intelligibility of the speech signal and the performance of speaker recognition systems, with severe consequences in a wide range of real applications. Channel equalization, i.e. removing, reducing, or otherwise cleaning the channel effects, mitigates the mismatch problem to some extent, but at the cost of added distortion to the vulnerable speech signal itself, so its effectiveness is limited. This paper proposes to estimate the reverberation first and incorporate it into individual training examples to create virtually matched channels. Training is performed before the final decision-making. In the training stage, a set of models is trained in different reverberant environments; in the test stage, the target model that acoustically matches the test reverberation is selected from this set. The best-matching model is selected by blindly estimating the full-band reverberation time (RT) using maximum likelihood. Speaker recognition experiments in artificial and real reverberant conditions show the effectiveness of the proposed method in terms of a decreased equal error rate (EER) and the detection error trade-off (DET).
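The selection step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a simplified decay model in which a free-decay segment is exponentially damped Gaussian noise, y[n] = a^n x[n], and finds the RT by a grid search over the likelihood. The function names, the candidate grid, and the set of training RTs are all hypothetical, and locating free-decay segments in running speech (the "blind" part) is not shown.

```python
import numpy as np

def estimate_rt60_ml(decay, fs, rt_grid):
    """Grid-search ML estimate of RT60 (seconds) from a free-decay segment.

    Assumed model (a simplification): decay[n] = a**n * x[n],
    with x[n] i.i.d. zero-mean Gaussian of unknown variance.
    """
    y2 = np.asarray(decay, dtype=float) ** 2
    n = np.arange(y2.size)
    best_rt, best_ll = None, -np.inf
    for rt in rt_grid:
        # A 60 dB amplitude decay over rt seconds gives -ln(a) = 3*ln(10)/(rt*fs).
        theta = 3.0 * np.log(10.0) / (rt * fs)
        # ML estimate of the noise variance for this candidate decay rate.
        sigma2 = np.mean(y2 * np.exp(2.0 * theta * n))
        # Log-likelihood with the variance profiled out (constants dropped).
        ll = theta * n.sum() - 0.5 * y2.size * np.log(sigma2)
        if ll > best_ll:
            best_rt, best_ll = rt, ll
    return best_rt

def select_matched_model(rt_estimate, models_by_rt):
    """Return (rt, model) for the model trained at the RT closest to the estimate."""
    rt = min(models_by_rt, key=lambda r: abs(r - rt_estimate))
    return rt, models_by_rt[rt]

# Synthetic free decay with a known RT60 of 0.5 s (illustration only).
rng = np.random.default_rng(0)
fs, true_rt = 8000, 0.5
n = np.arange(int(0.5 * fs))
a = np.exp(-3.0 * np.log(10.0) / (true_rt * fs))
decay = a ** n * rng.standard_normal(n.size)

rt_hat = estimate_rt60_ml(decay, fs, np.arange(0.1, 1.51, 0.01))

# Hypothetical speaker models, one per training reverberation time (seconds).
models_by_rt = {0.3: "model_rt_0.3", 0.5: "model_rt_0.5",
                0.8: "model_rt_0.8", 1.2: "model_rt_1.2"}
matched_rt, matched_model = select_matched_model(rt_hat, models_by_rt)
```

Choosing the nearest training RT is one simple matching rule; a real system could equally interpolate between models or train a new model at the estimated RT.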

Author information

Correspondence to Khamis A. Al-Karawi.

Cite this article

Al-Karawi, K.A., Ahmed, S.T. Model selection toward robustness speaker verification in reverberant conditions. Multimed Tools Appl 80, 36549–36566 (2021). https://doi.org/10.1007/s11042-021-11356-3

  • DOI: https://doi.org/10.1007/s11042-021-11356-3
