Abstract
Speech signals recorded in the far field or with a distant microphone typically contain additive noise and reverberation. These degrade the reliability and intelligibility of the speech signal and the performance of speaker recognition systems, with severe consequences in a wide range of real applications. Channel equalization, i.e. removing, reducing, or otherwise cleaning the channel effects, mitigates the mismatch problem to some extent, but at the cost of added distortion to the vulnerable speech signal itself, so its effectiveness is limited. This paper proposes to estimate the reverberation first and incorporate it into individual training examples to create virtually matched channels. Training is performed before the final decision-making. A set of models is trained in different reverberant environments; in the test stage, the target model that acoustically matches the reverberation of the test signal is selected from this set. The best-matching model is selected by blindly estimating the full-band reverberation time (RT) using maximum likelihood. Speaker recognition experiments in artificial and real reverberant conditions show the effectiveness of the proposed method in terms of a reduced equal error rate (EER) and improved detection error trade-off (DET) curves.
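The model selection step described in the abstract can be illustrated with a minimal sketch. It assumes the blind RT estimate is already available as a number in seconds and that each pre-trained model is indexed by the RT of its training environment. The function name `select_matched_model`, the bank `model_bank`, and the RT values are hypothetical and not taken from the paper; picking the nearest RT is one plausible matching rule, not necessarily the one the authors use.

```python
def select_matched_model(estimated_rt, models_by_rt):
    """Return the (training_rt, model) pair whose training RT is closest
    to the blindly estimated RT of the test utterance.

    estimated_rt -- RT estimate in seconds (assumed to come from a
                    separate maximum-likelihood estimator, not shown)
    models_by_rt -- dict mapping training RT in seconds to a model object
    """
    if not models_by_rt:
        raise ValueError("model bank is empty")
    # Nearest-neighbour rule on RT: an assumption made for illustration.
    return min(models_by_rt.items(), key=lambda kv: abs(kv[0] - estimated_rt))


# Hypothetical bank of models trained at different reverberation times.
model_bank = {
    0.3: "model_rt_0.3s",
    0.6: "model_rt_0.6s",
    0.9: "model_rt_0.9s",
    1.2: "model_rt_1.2s",
}

matched_rt, matched_model = select_matched_model(0.71, model_bank)
print(matched_rt, matched_model)
```

In a real system the dictionary values would be trained speaker models rather than strings, and the selected model would then score the test utterance in the verification stage.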
Cite this article
Al-Karawi, K.A., Ahmed, S.T. Model selection toward robustness speaker verification in reverberant conditions. Multimed Tools Appl 80, 36549–36566 (2021). https://doi.org/10.1007/s11042-021-11356-3
DOI: https://doi.org/10.1007/s11042-021-11356-3