Support Vector Machine Regression for Robust Speaker Verification in Mismatching and Forensic Conditions

  • Ismael Mateos-Garcia
  • Daniel Ramos
  • Ignacio Lopez-Moreno
  • Joaquin Gonzalez-Rodriguez
Part of the Lecture Notes in Computer Science book series (LNCS, volume 5558)


In this paper we propose the use of Support Vector Machine Regression (SVR) for robust speaker verification in two scenarios: i) strong mismatch in speech conditions and ii) a forensic environment. The proposed approach seeks robustness when a proper background database is reduced or absent, a situation typical of forensic cases that has been called database mismatch. For the mismatching-condition scenario, we use the NIST SRE 2008 core task as a highly variable environment, but with a mostly representative background set drawn from past NIST evaluations. For the forensic scenario, we use the Ahumada III database, a public corpus in Spanish coming from real authored forensic cases collected by the Spanish Guardia Civil. We show experiments illustrating the robustness of an SVR scheme using a GLDS kernel under strong session variability, even when no session variability compensation is applied, and especially in the forensic scenario, under database mismatch.
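As a rough illustration of the kind of scheme the abstract describes, the following minimal sketch maps each utterance to a fixed-length vector by averaging a polynomial expansion of its frames (the core idea behind a GLDS-style mapping) and then fits a linear regressor with the epsilon-insensitive loss used in SVR. Everything here is an assumption made for illustration and is not taken from the paper: the function names, the toy two-dimensional frames, the regression targets of +1 for the target speaker and -1 for impostors, the omission of background normalization, and the use of plain subgradient descent in place of a proper SVR solver.

```python
from itertools import combinations_with_replacement
from math import prod


def poly_expand(frame, degree=2):
    """Return all monomials of the frame components up to `degree`, constant first."""
    terms = [1.0]
    for d in range(1, degree + 1):
        for idx in combinations_with_replacement(range(len(frame)), d):
            terms.append(prod(frame[i] for i in idx))
    return terms


def glds_map(frames, degree=2):
    """Average the polynomial expansion over the frames of one utterance.

    Simplification: the background-based normalization of the full GLDS
    kernel is omitted here.
    """
    expanded = [poly_expand(f, degree) for f in frames]
    return [sum(col) / len(expanded) for col in zip(*expanded)]


def eps_insensitive_loss(score, target, eps=0.1):
    """SVR loss: errors smaller than `eps` cost nothing."""
    return max(0.0, abs(score - target) - eps)


def score(w, x):
    """Linear score of a mapped utterance."""
    return sum(wi * xi for wi, xi in zip(w, x))


def train_linear_svr(xs, ys, eps=0.1, c=1.0, lr=0.01, epochs=500):
    """Subgradient descent on 0.5*||w||^2 + c * sum of epsilon-insensitive losses."""
    w = [0.0] * len(xs[0])
    for _ in range(epochs):
        grad = list(w)  # gradient of the regularizer
        for x, y in zip(xs, ys):
            err = score(w, x) - y
            if abs(err) > eps:  # only points outside the tube contribute
                sign = 1.0 if err > 0 else -1.0
                for j, xj in enumerate(x):
                    grad[j] += c * sign * xj
        w = [wi - lr * g for wi, g in zip(w, grad)]
    return w


# Toy data (hypothetical): two utterances per class, two 2-D frames each.
target_utts = [[(1.0, 0.8), (0.9, 1.1)], [(1.2, 1.0), (0.8, 0.9)]]
impostor_utts = [[(-1.0, -0.9), (-1.1, -1.0)], [(-0.8, -1.2), (-0.9, -1.0)]]

xs = [glds_map(u) for u in target_utts + impostor_utts]
ys = [1.0, 1.0, -1.0, -1.0]  # assumed regression targets
w = train_linear_svr(xs, ys)

target_scores = [score(w, glds_map(u)) for u in target_utts]
impostor_scores = [score(w, glds_map(u)) for u in impostor_utts]
```

In this toy setting the target utterances should receive positive scores and the impostor utterances negative ones. Unlike the hinge loss of SVM classification, the epsilon-insensitive loss penalizes deviations from the target value on both sides of the tube.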


Speaker verification, forensic, GLDS, SVM classification, SVM regression, session variability compensation, robustness



Copyright information

© Springer-Verlag Berlin Heidelberg 2009

Authors and Affiliations

  • Ismael Mateos-Garcia (1)
  • Daniel Ramos (1)
  • Ignacio Lopez-Moreno (1)
  • Joaquin Gonzalez-Rodriguez (1)
  1. ATVS – Biometric Recognition Group, Escuela Politecnica Superior, Universidad Autonoma de Madrid, Madrid, Spain
