
Mitigate the Reverberant Effects on Speaker Recognition via Multi-training

  • Conference paper
Applied Computing to Support Industry: Innovation and Technology (ACRIT 2019)

Abstract

Speaker recognition techniques have reached a relatively mature state over the past few decades through continuous research and development. Existing methods typically use robust features extracted from clean speech signals and can therefore achieve very high recognition accuracy in idealized conditions. For critical applications, such as security forensics, robustness and reliability of the system are crucial. The reverberation condition can be represented by two main parameters, namely Reverberation Time (RT) and Direct-to-Reverberant Ratio (DRR), the latter of which reflects the distance between the microphone and the source. This paper presents an efficient method to mitigate, or at least alleviate, the impact of reverberation on speaker verification (SV). Three multi-condition training methods are investigated to mitigate such detrimental effects. The first uses matched train/test speaker models based on estimated RT values. The second uses two-condition training, in which clean and reverberant models are used. Lastly, a four-condition training setup is proposed and evaluated to improve system performance. The data used to build the SV experiments, including the training and test speech material, are obtained from the University of Salford Anechoic Chamber database (SALU-AC). Experimental results show that the first and the last types of multi-condition training provide significant gains in performance relative to the baseline.
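The first method, matching train and test models by estimated RT, can be illustrated with a minimal sketch. It assumes that a set of speaker models has already been trained at a few known RT values and that an RT estimate for the test utterance is available; the function name `select_matched_model`, the RT grid, and the model labels are hypothetical and are not taken from the paper.

```python
from typing import Dict, Tuple


def select_matched_model(estimated_rt: float,
                         models_by_rt: Dict[float, str]) -> Tuple[float, str]:
    """Return the (training RT, model) pair closest to the estimated RT.

    `models_by_rt` maps the RT (in seconds) used when training a model
    to that model. Plain strings stand in for real speaker models here.
    """
    if not models_by_rt:
        raise ValueError("no models available")
    # Pick the training condition with the smallest absolute RT difference.
    nearest_rt = min(models_by_rt, key=lambda rt: abs(rt - estimated_rt))
    return nearest_rt, models_by_rt[nearest_rt]


# Hypothetical RT grid (seconds); the paper's actual conditions are not stated here.
models = {0.3: "model_rt_0.3s", 0.6: "model_rt_0.6s", 0.9: "model_rt_0.9s"}

rt, model = select_matched_model(0.52, models)
print(rt, model)
```

Nearest-neighbour selection over a fixed RT grid is only one plausible way to realize matched-condition scoring; the abstract does not say how the estimated RT is mapped to a model.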



Author information

Corresponding author

Correspondence to Duraid Y. Mohammed.

Copyright information

© 2020 Springer Nature Switzerland AG

About this paper

Cite this paper

Mohammed, D.Y., Al-Karawi, K.A., Husien, I.M., Ghulam, M.A. (2020). Mitigate the Reverberant Effects on Speaker Recognition via Multi-training. In: Khalaf, M., Al-Jumeily, D., Lisitsa, A. (eds) Applied Computing to Support Industry: Innovation and Technology. ACRIT 2019. Communications in Computer and Information Science, vol 1174. Springer, Cham. https://doi.org/10.1007/978-3-030-38752-5_8

  • DOI: https://doi.org/10.1007/978-3-030-38752-5_8

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-38751-8

  • Online ISBN: 978-3-030-38752-5

  • eBook Packages: Computer Science, Computer Science (R0)
