An approach combining a simple local representation method with a direct voting scheme based on k-nearest neighbors is proposed for speaker recognition. This approach raises computational problems, which we solve effectively with an approximate fast k-nearest neighbors search technique. Experimental results on the EuTrans and SIVAspeech databases are reported, showing the effectiveness of the proposed approach.
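The abstract gives no implementation details, so the following is only a minimal sketch of how a k-nearest-neighbors direct voting scheme over local features might look. It assumes each utterance is represented as a set of local feature vectors, each training vector is labelled with its speaker, and every test vector casts one vote for the speaker of each of its k nearest training vectors. The function names, the toy two-dimensional data, and the squared Euclidean distance are illustrative assumptions. An exhaustive search stands in for the approximate fast search mentioned above.

```python
import heapq
from collections import Counter
from typing import List, Sequence, Tuple

Vector = Sequence[float]


def squared_distance(a: Vector, b: Vector) -> float:
    """Squared Euclidean distance (an assumed metric, not taken from the paper)."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def knn_direct_vote(
    train: List[Tuple[Vector, str]],
    test_vectors: List[Vector],
    k: int = 3,
) -> Tuple[str, Counter]:
    """Classify one utterance by letting each local vector vote k times.

    `train` holds (local feature vector, speaker label) pairs.
    Exhaustive search is used here for clarity; it is a stand-in for
    an approximate fast k-NN search.
    """
    votes: Counter = Counter()
    for query in test_vectors:
        # Find the k training vectors closest to this local feature vector.
        nearest = heapq.nsmallest(
            k, train, key=lambda item: squared_distance(item[0], query)
        )
        # Each neighbor casts one vote for its speaker.
        for _, speaker in nearest:
            votes[speaker] += 1
    winner, _ = votes.most_common(1)[0]
    return winner, votes


# Toy data: hypothetical 2-D local features for two hypothetical speakers.
train = [
    ((0.0, 0.1), "spk_a"), ((0.2, 0.0), "spk_a"), ((0.1, 0.2), "spk_a"),
    ((1.0, 1.1), "spk_b"), ((0.9, 1.0), "spk_b"), ((1.1, 0.9), "spk_b"),
]
test_vectors = [(0.1, 0.1), (0.15, 0.05), (0.95, 1.0)]

winner, votes = knn_direct_vote(train, test_vectors, k=3)
print(winner, dict(votes))
```

The exhaustive search costs one distance computation per pair of test and training vectors, which grows quickly when every utterance contributes many local vectors. That cost is the kind of computational problem an approximate nearest-neighbor search is meant to reduce.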


Keywords: Speaker Recognition; Local Features; Nearest Neighbor



Copyright information

© Springer-Verlag Berlin Heidelberg 2004

Authors and Affiliations

  • Roberto Paredes (1)
  • Enrique Vidal (1)
  • Francisco Casacuberta (1)
  1. Instituto Tecnológico de Informática, Departamento de Sistemas Informáticos y Computación, Universidad Politécnica de Valencia, Valencia, Spain
