Speaker Identification and Verification Using Support Vector Machines and Sparse Kernel Logistic Regression

  • Marcel Katz
  • Sven E. Krüger
  • Martin Schafföner
  • Edin Andelic
  • Andreas Wendemuth
Part of the Lecture Notes in Computer Science book series (LNCS, volume 4153)

Abstract

In this paper, we investigate two discriminative classification approaches for frame-based speaker identification and verification: the Support Vector Machine (SVM) and Sparse Kernel Logistic Regression (SKLR). SVMs have already shown good results in regression and classification in several fields of pattern recognition, as well as in continuous speech recognition. Whereas the non-probabilistic output of the SVM has to be translated into conditional probabilities, SKLR produces the probabilities directly.

In speaker identification and verification experiments on the POLYCOST database, both discriminative classification methods outperform the standard Gaussian Mixture Model (GMM) system.
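
The abstract notes that the raw SVM output has to be translated into conditional probabilities. One common way to do this is to pass the decision value through a fitted sigmoid, in the style of Platt scaling. The abstract does not say which mapping the authors used, so the sketch below is an assumption for illustration only. The function names, the toy scores, and the gradient-descent fitting loop are all hypothetical.

```python
import math


def sigmoid_posterior(score, a, b):
    """Map a raw SVM decision value to P(y=1 | score) = 1 / (1 + exp(a*score + b))."""
    z = a * score + b
    # Evaluate in a numerically stable way for large |z|.
    if z >= 0:
        e = math.exp(-z)
        return e / (1.0 + e)
    return 1.0 / (1.0 + math.exp(z))


def fit_sigmoid(scores, labels, lr=0.1, steps=2000):
    """Fit a and b by plain gradient descent on the negative log-likelihood.

    labels are 1 (target speaker) or 0 (any other speaker).
    Hypothetical helper: the fitting procedure is an assumption, not the paper's.
    """
    a, b = -1.0, 0.0
    n = len(scores)
    for _ in range(steps):
        grad_a = 0.0
        grad_b = 0.0
        for s, y in zip(scores, labels):
            p = sigmoid_posterior(s, a, b)
            # With z = a*s + b, the derivative of the loss w.r.t. z is (y - p).
            grad_a += (y - p) * s
            grad_b += (y - p)
        a -= lr * grad_a / n
        b -= lr * grad_b / n
    return a, b


# Toy decision values (made up) with overlapping classes.
toy_scores = [-2.0, -1.0, -0.5, 0.5, 1.0, 2.0]
toy_labels = [0, 0, 1, 0, 1, 1]

a, b = fit_sigmoid(toy_scores, toy_labels)
for s in toy_scores:
    print(f"score={s:+.1f}  P(target | score)={sigmoid_posterior(s, a, b):.3f}")
```

A model such as kernel logistic regression would not need this extra calibration step, since its output is already a probability estimate.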

Keywords

Support Vector Machine; Gaussian Mixture Model; Equal Error Rate; Speaker Recognition; Speaker Veri
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.

Copyright information

© Springer-Verlag Berlin Heidelberg 2006

Authors and Affiliations

  • Marcel Katz
    • 1
  • Sven E. Krüger
    • 1
  • Martin Schafföner
    • 1
  • Edin Andelic
    • 1
  • Andreas Wendemuth
    • 1
  1. IESK, Cognitive Systems, University of Magdeburg, Germany
