Speaker Verification Using Adapted User-Dependent Multilevel Fusion

  • Julian Fierrez-Aguilar
  • Daniel Garcia-Romero
  • Javier Ortega-Garcia
  • Joaquin Gonzalez-Rodriguez
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 3541)


In this paper we study the application of user-dependent score fusion to multilevel speaker recognition. After reviewing related work in multimodal biometric authentication, we describe a new score fusion technique. The method uses a form of Bayesian adaptation to derive personalized fusion functions from prior user-independent data. Experimental results are reported using the MIT Lincoln Laboratory multilevel speaker verification system. The experiments show that the proposed adapted fusion method outperforms both user-independent and non-adapted user-dependent fusion approaches.
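
The abstract does not state the adaptation equations, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes that the scores of each subsystem are modelled with diagonal-covariance Gaussians for the client and impostor classes, that only the client mean is adapted, and that the adaptation is a relevance-factor MAP update of the kind commonly used for adapted Gaussian mixture models. The function names, the relevance value, and all numbers are hypothetical.

```python
import math


def mean_vector(rows):
    """Column-wise mean of a list of equal-length score vectors."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]


def map_adapt_mean(global_mean, user_scores, relevance=4.0):
    """Relevance-factor MAP update of a mean vector (assumed form).

    With n user samples, alpha = n / (n + relevance) moves the estimate
    from the user-independent mean toward the user's own sample mean.
    """
    n = len(user_scores)
    if n == 0:
        # No user data: fall back to the user-independent model.
        return list(global_mean)
    alpha = n / (n + relevance)
    user_mean = mean_vector(user_scores)
    return [alpha * u + (1.0 - alpha) * g
            for u, g in zip(user_mean, global_mean)]


def log_gauss_diag(x, mean, var):
    """Log density of a diagonal-covariance Gaussian at x."""
    return sum(-0.5 * (math.log(2.0 * math.pi * v) + (xi - m) ** 2 / v)
               for xi, m, v in zip(x, mean, var))


def fused_score(x, client_mean, impostor_mean, client_var, impostor_var):
    """Fused score as a client-versus-impostor log-likelihood ratio."""
    return (log_gauss_diag(x, client_mean, client_var)
            - log_gauss_diag(x, impostor_mean, impostor_var))


# Hypothetical user-independent statistics for two subsystems.
global_client_mean = [2.0, 1.5]
global_impostor_mean = [-1.0, -0.5]
shared_var = [1.0, 1.0]

# Hypothetical genuine score vectors available for one enrolled user.
user_genuine = [[3.0, 2.5], [3.4, 2.1]]

adapted_client_mean = map_adapt_mean(global_client_mean, user_genuine)

# Score one trial with the user-independent and the adapted fusion function.
trial = [3.1, 2.2]
score_global = fused_score(trial, global_client_mean, global_impostor_mean,
                           shared_var, shared_var)
score_adapted = fused_score(trial, adapted_client_mean, global_impostor_mean,
                            shared_var, shared_var)
```

In this sketch the relevance factor controls how quickly the user-specific data overrides the prior: with few enrolment scores the adapted function stays close to the user-independent one, which is the motivation for adapting rather than training a purely user-dependent function.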





Copyright information

© Springer-Verlag Berlin Heidelberg 2005

Authors and Affiliations

  • Julian Fierrez-Aguilar
    • 1
  • Daniel Garcia-Romero
    • 1
  • Javier Ortega-Garcia
    • 1
  • Joaquin Gonzalez-Rodriguez
    • 1
  1. Biometrics Research Lab./ATVS, Escuela Politecnica Superior, Universidad Autonoma de Madrid, Madrid, Spain
