Real-Time Speaker Verification System Implemented on Reconfigurable Hardware

Journal of Signal Processing Systems

Abstract

Biometrics is now considered a promising solution in the security and personal verification market. Applications such as financial transactions, law enforcement and network management security already benefit from this technology. Among the different biometric modalities, speaker verification is an accurate and efficient way of authenticating a person’s identity by analyzing his or her voice. This method is especially suitable in real-life scenarios or when remote recognition over the phone is required. Processing a voice signal to extract the unique features that distinguish an individual, and so confirm or deny his or her identity, usually carries a high computational cost. Because of this complexity, many systems based on microprocessors clocked at hundreds of MHz are unable to process voice samples in real time. This drawback matters because the response time of a biometric system generally affects its acceptance by users. FPGA (Field Programmable Gate Array) based design is well suited to implementing systems that require high computational capability and real-time execution of algorithms. These devices also allow the design of complex digital systems with outstanding execution-time performance. This paper presents the implementation of a speaker verification system that combines MFCC (Mel-Frequency Cepstrum Coefficients) features with an SVM (Support Vector Machine) classifier on a low-cost FPGA. Experimental results show that our system is able to verify a person’s identity as fast as a personal computer based on a high-performance Pentium IV microprocessor.
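As an illustration of the kind of processing chain the abstract refers to, the following is a minimal software sketch of a generic MFCC front end followed by an SVM decision function. It is not the hardware design described in the paper. The frame length, sampling rate, number of mel filters, number of coefficients, the RBF kernel and all function names are assumptions chosen for illustration only.

```python
import math

def hz_to_mel(f_hz):
    # Common mel-scale formula (an assumption; other variants exist).
    return 2595.0 * math.log10(1.0 + f_hz / 700.0)

def mel_to_hz(mel):
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)

def power_spectrum(frame):
    """Hamming-window the frame and return |DFT|^2 for bins 0..N/2 (naive DFT)."""
    n = len(frame)
    windowed = [s * (0.54 - 0.46 * math.cos(2.0 * math.pi * i / (n - 1)))
                for i, s in enumerate(frame)]
    spectrum = []
    for k in range(n // 2 + 1):
        re = sum(w * math.cos(2.0 * math.pi * k * i / n) for i, w in enumerate(windowed))
        im = -sum(w * math.sin(2.0 * math.pi * k * i / n) for i, w in enumerate(windowed))
        spectrum.append(re * re + im * im)
    return spectrum

def mel_filterbank(num_filters, n_fft, sample_rate):
    """Triangular filters whose edges are equally spaced on the mel scale."""
    top = hz_to_mel(sample_rate / 2.0)
    edges_hz = [mel_to_hz(top * i / (num_filters + 1)) for i in range(num_filters + 2)]
    bins = [int(math.floor((n_fft + 1) * f / sample_rate)) for f in edges_hz]
    bank = []
    for m in range(1, num_filters + 1):
        lo, mid, hi = bins[m - 1], bins[m], bins[m + 1]
        filt = [0.0] * (n_fft // 2 + 1)
        for k in range(lo, mid):          # rising slope
            filt[k] = (k - lo) / (mid - lo)
        for k in range(mid, hi):          # falling slope
            filt[k] = (hi - k) / (hi - mid)
        bank.append(filt)
    return bank

def mfcc(frame, sample_rate, num_filters=20, num_coeffs=12):
    """MFCCs of one frame: power spectrum -> mel energies -> log -> DCT-II."""
    spec = power_spectrum(frame)
    bank = mel_filterbank(num_filters, len(frame), sample_rate)
    log_e = [math.log(sum(f * p for f, p in zip(filt, spec)) + 1e-10) for filt in bank]
    # Coefficient 0 is skipped here, a common but not universal choice.
    return [sum(e * math.cos(math.pi * c * (m + 0.5) / num_filters)
                for m, e in enumerate(log_e))
            for c in range(1, num_coeffs + 1)]

def svm_score(x, support_vectors, weights, bias, gamma=0.1):
    """f(x) = sum_i w_i * K(s_i, x) + b with an RBF kernel; w_i = alpha_i * y_i."""
    total = bias
    for s, w in zip(support_vectors, weights):
        dist2 = sum((a - b) ** 2 for a, b in zip(s, x))
        total += w * math.exp(-gamma * dist2)
    return total

# Toy usage: a synthetic 440 Hz tone stands in for one frame of speech.
frame = [math.sin(2.0 * math.pi * 440.0 * i / 8000.0) for i in range(256)]
features = mfcc(frame, sample_rate=8000)
impostor = [v + 1.0 for v in features]   # hypothetical second support vector
accepted = svm_score(features, [features, impostor], [1.0, -1.0], bias=0.0) > 0.0
```

In this sketch a positive score is read as "identity accepted". A real system would train the support vectors and weights on enrolment data and accumulate scores over many frames, and the naive DFT would be replaced by an FFT.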

Author information

Corresponding author

Correspondence to Rafael Ramos-Lara.

About this article

Cite this article

Ramos-Lara, R., López-García, M., Cantó-Navarro, E. et al. Real-Time Speaker Verification System Implemented on Reconfigurable Hardware. J Sign Process Syst 71, 89–103 (2013). https://doi.org/10.1007/s11265-012-0683-5

  • DOI: https://doi.org/10.1007/s11265-012-0683-5
