Abstract
Speaker identification is the process of identifying a person from their voice. Speech signals carry speaker-specific characteristics because vocal tract resonances differ between speakers, and these characteristics can be captured by extracting feature vectors such as Mel-frequency cepstral coefficients (MFCCs) from the speech signal. A Gaussian mixture model (GMM), a well-known statistical model, is then used to model the distribution of each speaker’s MFCCs in a multidimensional acoustic space. GMM-based speaker identification has features that make it promising for hardware acceleration. This paper describes a hardware implementation of the classification stage of a text-independent GMM-based speaker identification system. A speed-up factor of 90 was achieved compared with a software implementation on a standard PC.
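The classification step summarized in the abstract can be illustrated with a minimal software sketch. It is not the hardware design described in the paper. It assumes diagonal-covariance GMMs, already-extracted MFCC frames, and maximum-likelihood selection of the speaker. The function names `gmm_log_likelihood` and `identify_speaker` and the layout of the model parameters are hypothetical choices made for illustration.

```python
import numpy as np


def gmm_log_likelihood(frames, weights, means, variances):
    """Total log-likelihood of MFCC frames under one diagonal-covariance GMM.

    frames:    (T, D) array, one feature vector per frame
    weights:   (M,)   mixture weights, summing to 1
    means:     (M, D) component means
    variances: (M, D) per-dimension component variances (assumed diagonal)
    """
    frames = np.asarray(frames, dtype=float)
    weights = np.asarray(weights, dtype=float)
    means = np.asarray(means, dtype=float)
    variances = np.asarray(variances, dtype=float)

    # Difference of every frame from every component mean: shape (T, M, D).
    diff = frames[:, None, :] - means[None, :, :]

    # Log of the Gaussian normalisation constant per component: shape (M,).
    log_norm = -0.5 * np.log(2.0 * np.pi * variances).sum(axis=1)

    # Log density of each frame under each component: shape (T, M).
    log_comp = log_norm - 0.5 * (diff ** 2 / variances).sum(axis=2)
    log_weighted = np.log(weights) + log_comp

    # Log-sum-exp over components, shifted by the maximum for stability.
    peak = log_weighted.max(axis=1, keepdims=True)
    frame_ll = peak[:, 0] + np.log(np.exp(log_weighted - peak).sum(axis=1))

    # Frames are treated as independent, so their log-likelihoods add.
    return float(frame_ll.sum())


def identify_speaker(frames, models):
    """Return the best-scoring speaker and all scores.

    models maps a speaker name to a (weights, means, variances) tuple.
    """
    scores = {
        name: gmm_log_likelihood(frames, *params)
        for name, params in models.items()
    }
    return max(scores, key=scores.get), scores
```

Each speaker's score is a sum over frames of a log-sum over mixture components, so the per-frame, per-component terms can be evaluated independently. That regular structure is one plausible reading of why the abstract calls the approach promising for hardware acceleration.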
© 2010 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Kan, P.L.E., Allen, T., Quigley, S.F. (2010). A GMM-Based Speaker Identification System on FPGA. In: Sirisuk, P., Morgan, F., El-Ghazawi, T., Amano, H. (eds) Reconfigurable Computing: Architectures, Tools and Applications. ARC 2010. Lecture Notes in Computer Science, vol 5992. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-12133-3_34
DOI: https://doi.org/10.1007/978-3-642-12133-3_34
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-12132-6
Online ISBN: 978-3-642-12133-3
eBook Packages: Computer Science (R0)