Abstract
Most existing modelling techniques for speaker recognition implicitly assume that sufficient data is available for speaker modelling, and hence may model speakers poorly under limited-data conditions. The present work experimentally evaluates several modelling techniques: Crisp Vector Quantization (CVQ), Fuzzy Vector Quantization (FVQ), Self-Organizing Map (SOM), Learning Vector Quantization (LVQ), and Gaussian Mixture Model (GMM) classifiers. The widely used Gaussian Mixture Model-Universal Background Model (GMM-UBM) is also evaluated. The experimental results are then used to select a subset of classifiers to combine. It is proposed that the combined LVQ and GMM-UBM classifier performs relatively better than all the individual classifiers and the other combined classifiers.
Cite this article
Jayanna, H.S., Mahadeva Prasanna, S.R. An experimental comparison of modelling techniques for speaker recognition under limited data condition. Sadhana 34, 717–728 (2009). https://doi.org/10.1007/s12046-009-0042-9
DOI: https://doi.org/10.1007/s12046-009-0042-9