A GMM-Based Speaker Identification System on FPGA

Kan, Phak Len Eh; Allen, Tim; Quigley, Steven F.

doi:10.1007/978-3-642-12133-3_34

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 5992)

Included in the following conference series:

International Symposium on Applied Reconfigurable Computing

Abstract

Speaker identification is the process of identifying a person from their voice. Speech signals carry speaker-specific characteristics because different speakers have different vocal tract resonances; these can be exploited by extracting feature vectors such as Mel-frequency cepstral coefficients (MFCCs) from the speech signal. The Gaussian Mixture Model (GMM), a well-known statistical model, is then used to model the distribution of each speaker's MFCCs in a multidimensional acoustic space. The GMM-based speaker identification system has features that make it promising for hardware acceleration. This paper describes a hardware implementation of the classification stage of a text-independent GMM-based speaker identification system. A speedup factor of 90 was achieved compared with a software-based implementation on a standard PC.
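
The classification step summarised in the abstract can be illustrated in software. The sketch below is not the authors' hardware design, which the preview does not show; it is a minimal Python version of the standard GMM scoring rule, assuming diagonal covariance matrices and a log-domain accumulation (a "log-add") to avoid numerical underflow. All function names, the model layout (a list of `(weight, mean, variance)` tuples) and the toy data are hypothetical.

```python
import math

def log_gaussian_diag(x, mean, var):
    """Log density of a diagonal-covariance Gaussian at feature vector x."""
    acc = 0.0
    for xi, mi, vi in zip(x, mean, var):
        acc += math.log(2.0 * math.pi * vi) + (xi - mi) ** 2 / vi
    return -0.5 * acc

def log_add(a, b):
    """Return log(exp(a) + exp(b)) without leaving the log domain."""
    if a < b:
        a, b = b, a
    if b == -math.inf:
        return a
    return a + math.log1p(math.exp(b - a))

def frame_log_likelihood(x, gmm):
    """Log-likelihood of one feature vector under one speaker's GMM."""
    total = -math.inf
    for weight, mean, var in gmm:
        total = log_add(total, math.log(weight) + log_gaussian_diag(x, mean, var))
    return total

def identify(frames, speaker_models):
    """Sum frame log-likelihoods per speaker and pick the best-scoring one."""
    scores = {
        name: sum(frame_log_likelihood(x, gmm) for x in frames)
        for name, gmm in speaker_models.items()
    }
    return max(scores, key=scores.get), scores

# Toy 2-D "feature vectors" (assumed data; real MFCC vectors have more dimensions).
models = {
    "speaker_a": [(0.5, [0.0, 0.0], [1.0, 1.0]), (0.5, [1.0, 1.0], [1.0, 1.0])],
    "speaker_b": [(1.0, [5.0, 5.0], [1.0, 1.0])],
}
frames = [[0.1, -0.2], [0.3, 0.1]]
best, _ = identify(frames, models)
print(best)  # prints "speaker_a"
```

Every frame is scored independently against every mixture component of every speaker model, so the work is regular and highly parallel, which is one plausible reading of why the abstract calls the system promising for hardware acceleration.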

Author information

Authors and Affiliations

School of Electronic, Electrical and Computer Engineering, University of Birmingham, Edgbaston, Birmingham, B15 2TT, United Kingdom
Phak Len Eh Kan, Tim Allen & Steven F. Quigley

Editor information

Editors and Affiliations

Department of Electronic Engineering, Mahanakorn University of Technology, 10530, Bangkok, Thailand
Phaophak Sirisuk
Department of Electronic Engineering, National University of Ireland, Galway, Ireland
Fearghal Morgan
Department of Electrical and Computer Engineering, The George Washington University, 20052, Washington, DC, USA
Tarek El-Ghazawi
Department of Information and Computer Science, Yokohama, Keio University, 223–8522, Kanagawa, Japan
Hideharu Amano

About this paper

Cite this paper

Kan, P.L.E., Allen, T., Quigley, S.F. (2010). A GMM-Based Speaker Identification System on FPGA. In: Sirisuk, P., Morgan, F., El-Ghazawi, T., Amano, H. (eds) Reconfigurable Computing: Architectures, Tools and Applications. ARC 2010. Lecture Notes in Computer Science, vol 5992. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-12133-3_34

DOI: https://doi.org/10.1007/978-3-642-12133-3_34
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-12132-6
Online ISBN: 978-3-642-12133-3
eBook Packages: Computer Science, Computer Science (R0)
