Journal of Signal Processing Systems, Volume 71, Issue 2, pp 89–103

Real-Time Speaker Verification System Implemented on Reconfigurable Hardware

  • Rafael Ramos-Lara
  • Mariano López-García
  • Enrique Cantó-Navarro
  • Luís Puente-Rodriguez


Biometrics is now regarded as a promising solution in the security and personal verification market. Applications such as financial transactions, law enforcement, and network security management already benefit from this technology. Among the different biometric modalities, speaker verification offers an accurate and efficient way of authenticating a person’s identity by analyzing his or her voice. This identification method is especially suitable in real-life scenarios or when remote recognition over the phone is required. Processing a voice signal to extract the unique features that distinguish an individual, and so confirm or deny his or her identity, usually carries a high computational cost. Because of this complexity, many systems based on microprocessors clocked at hundreds of MHz are unable to process voice samples in real time. This drawback matters because the response time of a biometric system generally affects its acceptance by users. FPGAs (Field Programmable Gate Arrays) are well suited to implementing systems that require high computational capability and real-time execution of algorithms. These devices also allow the design of complex digital systems with outstanding performance in terms of execution time. This paper presents the implementation of an MFCC (Mel-Frequency Cepstrum Coefficients) and SVM (Support Vector Machine) speaker verification system on a low-cost FPGA. Experimental results show that the system verifies a person’s identity as fast as a high-performance microprocessor-based platform, namely a Pentium IV personal computer.
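As a rough illustration of the MFCC-SVM idea named in the abstract, the Python sketch below shows two pieces such a system typically contains: the mel-frequency mapping that underlies MFCC features, and an SVM decision applied to per-frame feature vectors. It is not the authors' FPGA design. The mel formula is the commonly used one, while the linear kernel, the frame-averaging rule, the zero threshold, and all function names are assumptions made for illustration only.

```python
import math

def hz_to_mel(f_hz):
    """Commonly used mel-scale mapping (assumed; the abstract gives no formula)."""
    return 2595.0 * math.log10(1.0 + f_hz / 700.0)

def mel_to_hz(mel):
    """Inverse of hz_to_mel."""
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)

def svm_score(x, support_vectors, dual_coefs, bias):
    """Linear-kernel SVM decision value: sum_i coef_i * <sv_i, x> + bias.

    The linear kernel is an assumption; the abstract does not name a kernel.
    """
    total = bias
    for sv, coef in zip(support_vectors, dual_coefs):
        total += coef * sum(a * b for a, b in zip(sv, x))
    return total

def verify(frames, support_vectors, dual_coefs, bias, threshold=0.0):
    """Accept the claimed identity if the mean per-frame score exceeds threshold.

    Averaging over frames and the default threshold are hypothetical choices.
    """
    scores = [svm_score(f, support_vectors, dual_coefs, bias) for f in frames]
    mean_score = sum(scores) / len(scores)
    return mean_score > threshold, mean_score
```

In this sketch each element of `frames` stands for one MFCC feature vector. A real system would obtain those vectors from windowing, an FFT, a mel filter bank, a logarithm, and a DCT, and would take the SVM parameters from a trained speaker model.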


Biometrics; Field programmable gate array (FPGA); Real-time systems; Embedded systems; Special-purpose hardware; Speaker verification



Copyright information

© Springer Science+Business Media, LLC 2012

Authors and Affiliations

  • Rafael Ramos-Lara (1)
  • Mariano López-García (1)
  • Enrique Cantó-Navarro (2)
  • Luís Puente-Rodriguez (3)

  1. Technical University of Catalonia, Vilanova i la Geltrú, Spain
  2. Universidad Rovira i Virgili, Sant Pere i Sant Pau, Spain
  3. Universidad Carlos III de Madrid, Leganes, Spain
