Early reflection detection using autocorrelation to improve robustness of speaker verification in reverberant conditions
A speech signal captured by a distant microphone is generally smeared by reverberation, which severely degrades automatic speaker recognition performance. To improve system performance, this paper proposes an effective and robust method for extracting features for speech processing. A room impulse response is assumed to comprise three parts: a direct-path response, early reflections, and late reverberation. Although late reverberation is known to be a major cause of performance degradation, this paper focuses on the effect of early reflections, because early reflections and their properties play an essential role in the acoustics of an enclosure. In the first stage, the proposed method estimates the early reflections from the speech signals using the autocorrelation function; in the second stage, the estimates are combined with an anechoic signal to train the system. The method appears promising, achieving a substantial improvement in system performance in terms of equal error rate and detection error trade-off, especially at longer reverberation times.
Keywords: Speaker recognition, Early reflection, Autocorrelation function, GMM, GFCC
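The first stage described in the abstract, estimating an early reflection from the autocorrelation function, can be illustrated with a minimal sketch. This is not the paper's implementation: it assumes a single reflection modelled as a delayed, attenuated copy of the source, and it uses white noise as a stand-in for an anechoic signal (real speech adds pitch-related autocorrelation peaks that would need separate handling). The function names `autocorrelation` and `estimate_reflection`, the sample rate, the delay, and the gain are all hypothetical choices made for illustration.

```python
import numpy as np


def autocorrelation(x):
    """Normalized one-sided autocorrelation, computed via the FFT."""
    x = np.asarray(x, dtype=float)
    x = x - x.mean()
    n = len(x)
    # Zero-pad to a power of two above 2n-1 to avoid circular wrap-around.
    nfft = 1 << (2 * n - 1).bit_length()
    spectrum = np.fft.rfft(x, nfft)
    acf = np.fft.irfft(spectrum * np.conj(spectrum), nfft)[:n]
    return acf / acf[0]  # lag 0 is scaled to 1


def estimate_reflection(x, min_lag, max_lag):
    """Return (lag, peak height) of the strongest autocorrelation peak
    inside [min_lag, max_lag], taken here as the reflection delay."""
    acf = autocorrelation(x)
    lag = min_lag + int(np.argmax(acf[min_lag:max_lag + 1]))
    return lag, float(acf[lag])


# Synthetic test signal (all values below are illustrative assumptions).
rng = np.random.default_rng(0)
fs = 16000                      # sample rate in Hz
dry = rng.standard_normal(fs)   # 1 s of white noise as the "anechoic" source
true_delay = 80                 # reflection delay in samples (5 ms at 16 kHz)
gain = 0.6                      # reflection amplitude relative to direct path

wet = dry.copy()
wet[true_delay:] += gain * dry[:-true_delay]  # direct path + one reflection

lag, peak = estimate_reflection(wet, min_lag=16, max_lag=800)
print(f"estimated delay: {lag} samples ({1000 * lag / fs:.2f} ms)")
```

For a white source with a single reflection of relative gain g, the normalized autocorrelation at the reflection lag is approximately g / (1 + g^2), so the peak height also carries information about the reflection strength. The lower search bound `min_lag` is there to skip the main lobe around lag zero.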