Scores Calibration in Speaker Recognition Systems

Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 9811)

Abstract

It is well known that variability in speech signal quality affects the performance of speaker recognition systems. A difference in speech quality between enrollment and test utterances shifts the scores and degrades performance. Score calibration is required to keep speaker recognition effective under these conditions. Speech signal parameters that have a strong impact on speaker recognition performance are total speech duration, signal-to-noise ratio, and reverberation time. Their variability leads to score shifts and unreliable accept/reject decisions. In this paper we investigate the effect of speech duration variability on calibration when the enrollment and test utterances originate from the same channel. An effective method of score stabilization is also presented.
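
The abstract does not describe the stabilization method itself, so the following is only a minimal sketch of what score calibration means in general. It assumes a common affine mapping, llr = a * score + b, fitted by logistic regression; this is a standard technique and not necessarily the one used in the paper. The synthetic scores, the size of the shift, and all function names are assumptions made for illustration.

```python
import math
import random


def sigmoid(x):
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def fit_affine_calibration(scores, labels, lr=0.1, epochs=500):
    """Fit llr = a * score + b by gradient descent on the mean logistic loss.

    labels: 1 for target (same-speaker) trials, 0 for impostor trials.
    Scores are centered internally to keep gradient descent well conditioned.
    """
    n = len(scores)
    mean = sum(scores) / n
    a, b = 1.0, 0.0  # parameters in the centered score domain
    for _ in range(epochs):
        grad_a = grad_b = 0.0
        for s, y in zip(scores, labels):
            x = s - mean
            err = sigmoid(a * x + b) - y  # derivative of the loss w.r.t. the logit
            grad_a += err * x
            grad_b += err
        a -= lr * grad_a / n
        b -= lr * grad_b / n
    # Convert the offset back to the uncentered score domain.
    return a, b - a * mean


def error_rate(scores, labels, threshold=0.0):
    """Fraction of wrong decisions when accepting scores above the threshold."""
    wrong = sum((s > threshold) != bool(y) for s, y in zip(scores, labels))
    return wrong / len(scores)


# Synthetic raw scores (assumed): both classes sit about 3 units too high,
# mimicking a condition-dependent shift, so a fixed threshold of 0 accepts
# most impostor trials.
rng = random.Random(0)
labels = [1] * 200 + [0] * 200
raw = [rng.gauss(5.0, 1.0) for _ in range(200)]
raw += [rng.gauss(1.0, 1.0) for _ in range(200)]

a, b = fit_affine_calibration(raw, labels)
calibrated = [a * s + b for s in raw]

raw_error = error_rate(raw, labels)
calibrated_error = error_rate(calibrated, labels)
print(f"a={a:.2f} b={b:.2f} raw={raw_error:.3f} calibrated={calibrated_error:.3f}")
```

In this toy setting the fixed threshold of 0 becomes usable again after the mapping, because the fitted offset absorbs the shift. A duration-aware variant would, under further assumptions, make the offset depend on the speech duration of the enrollment and test utterances.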

Keywords

Speaker recognition, Calibration scores, Tuning speaker recognition system

Notes

Acknowledgments

This work was partially supported financially by the Government of the Russian Federation, Grant 074-U01.

Copyright information

© Springer International Publishing Switzerland 2016

Authors and Affiliations

  • Andrey Shulipa (1)
  • Sergey Novoselov (1, 2)
  • Yuri Matveev (1, 2)
  1. ITMO University, Saint Petersburg, Russia
  2. Speech Technology Center, Saint Petersburg, Russia
