Real-Time Speaker Verification System Implemented on Reconfigurable Hardware
Biometrics is now regarded as a promising solution for security and personal verification. Applications such as financial transactions, law enforcement, and network security management already benefit from this technology. Among the biometric modalities, speaker verification offers an accurate and efficient way of authenticating a person's identity by analyzing his or her voice. The method is especially suitable in real-life scenarios or when remote recognition over the phone is required. Processing a voice signal to extract the unique features that distinguish an individual, and thereby confirm or deny a claimed identity, usually carries a high computational cost. Because of this complexity, many systems based on microprocessors clocked at hundreds of MHz cannot process voice samples in real time. This drawback matters because the response time of a biometric system generally affects its acceptance by users. FPGA (Field Programmable Gate Array) based design is well suited to systems that require high computational capability and real-time execution of algorithms, and these devices allow complex digital systems to be built with outstanding execution-time performance. This paper presents the implementation of an MFCC (Mel-Frequency Cepstrum Coefficients) and SVM (Support Vector Machine) speaker verification system on a low-cost FPGA. Experimental results show that our system verifies a person's identity as fast as a high-performance personal computer based on a Pentium IV microprocessor.
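To make the MFCC and SVM stages named in the abstract concrete, the following is a minimal software sketch of that kind of pipeline for a single frame. It is not the authors' hardware design. The frame length (256 samples), sample rate (8 kHz), filter count (20), coefficient count (12), the linear kernel, and the weight values are all assumptions chosen for illustration, since the abstract does not state them.

```python
import cmath
import math

def hz_to_mel(f):
    """Convert a frequency in Hz to the mel scale (common 2595*log10 form)."""
    return 2595.0 * math.log10(1.0 + f / 700.0)

def mel_to_hz(m):
    """Inverse of hz_to_mel."""
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def power_spectrum(frame):
    """Apply a Hamming window, then return |DFT|^2 for bins 0..N/2.

    A naive O(N^2) DFT keeps the sketch dependency-free; a real system
    would use an FFT.
    """
    n = len(frame)
    windowed = [x * (0.54 - 0.46 * math.cos(2 * math.pi * i / (n - 1)))
                for i, x in enumerate(frame)]
    spectrum = []
    for k in range(n // 2 + 1):
        acc = sum(windowed[t] * cmath.exp(-2j * math.pi * k * t / n)
                  for t in range(n))
        spectrum.append(abs(acc) ** 2)
    return spectrum

def mel_filterbank(num_filters, n_fft, sample_rate):
    """Build triangular filters whose centers are evenly spaced in mel."""
    top = hz_to_mel(sample_rate / 2)
    mel_points = [top * i / (num_filters + 1) for i in range(num_filters + 2)]
    bins = [int(math.floor((n_fft + 1) * mel_to_hz(m) / sample_rate))
            for m in mel_points]
    bank = []
    for j in range(1, num_filters + 1):
        lo, mid, hi = bins[j - 1], bins[j], bins[j + 1]
        filt = [0.0] * (n_fft // 2 + 1)
        for k in range(lo, mid):       # rising edge of the triangle
            filt[k] = (k - lo) / (mid - lo)
        for k in range(mid, hi):       # falling edge of the triangle
            filt[k] = (hi - k) / (hi - mid)
        bank.append(filt)
    return bank

def mfcc(frame, sample_rate, num_filters=20, num_coeffs=12):
    """Compute MFCCs for one frame: spectrum -> mel energies -> log -> DCT-II."""
    spectrum = power_spectrum(frame)
    bank = mel_filterbank(num_filters, len(frame), sample_rate)
    # Floor the energy so the logarithm is defined for empty filters.
    log_energy = [math.log(max(sum(w * p for w, p in zip(f, spectrum)), 1e-10))
                  for f in bank]
    return [sum(e * math.cos(math.pi * c * (m + 0.5) / num_filters)
                for m, e in enumerate(log_energy))
            for c in range(1, num_coeffs + 1)]

def svm_decision(features, weights, bias):
    """Linear SVM score (assumed kernel); a positive score means 'accept'."""
    return sum(w * x for w, x in zip(weights, features)) + bias

# Synthetic 440 Hz tone standing in for one frame of speech.
SAMPLE_RATE = 8000
frame = [math.sin(2 * math.pi * 440 * t / SAMPLE_RATE) for t in range(256)]
coeffs = mfcc(frame, SAMPLE_RATE)

# Hypothetical placeholder model; real weights would come from SVM training.
weights = [0.01] * len(coeffs)
score = svm_decision(coeffs, weights, bias=0.0)
accepted = score > 0
```

In a complete verifier, the features of many frames would be scored against a model trained for the claimed speaker. The per-frame arithmetic shown here (transform, filterbank, logarithm, cosine transform, dot product) is the kind of workload the abstract describes as computationally costly.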
Journal of Signal Processing Systems
Volume 71, Issue 2, pp 89-103
- Springer US
- Keywords
- Field programmable gate array (FPGA)
- Real-time systems
- Embedded systems
- Special-purpose hardware
- Speaker verification
- Author Affiliations
- 1. Technical University of Catalonia, Av. Victor Balaguer s/n, Vilanova i la Geltrú, 08800, Spain
- 2. Universidad Rovira i Virgili, Campus Sescelades, Av. Països Catalans 26, Sant Pere i Sant Pau, 43007, Spain
- 3. Universidad Carlos III de Madrid, Avda. Universidad, 30, Leganes, 28911, Spain