An experimental comparison of modelling techniques for speaker recognition under limited data condition

Sadhana

Abstract

Most existing modelling techniques for speaker recognition implicitly assume that sufficient data is available for speaker modelling, and may therefore model speakers poorly under limited data conditions. The present work experimentally evaluates several modelling techniques under such conditions: Crisp Vector Quantization (CVQ), Fuzzy Vector Quantization (FVQ), Self-Organizing Map (SOM), Learning Vector Quantization (LVQ), and Gaussian Mixture Model (GMM) classifiers. The widely used Gaussian Mixture Model-Universal Background Model (GMM-UBM) is also evaluated. The experimental findings are then used to select a subset of classifiers to combine. The results suggest that the combined LVQ and GMM-UBM classifier performs relatively better than all the individual classifiers and the other combined classifiers.
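
As a rough illustration of the simplest technique named above, the following is a minimal sketch of crisp VQ speaker identification: one codebook is trained per speaker with plain k-means, and a test utterance is assigned to the speaker whose codebook gives the lowest average quantization distortion. It is not the authors' implementation. The function names, the codebook size, the squared Euclidean distance, and the two-dimensional toy features (standing in for real spectral features) are all assumptions made for this example.

```python
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def train_codebook(vectors, size, iters=20, seed=0):
    """Train a crisp VQ codebook with plain k-means (hypothetical helper)."""
    rng = random.Random(seed)
    codebook = rng.sample(vectors, size)  # initialise from training vectors
    for _ in range(iters):
        cells = [[] for _ in codebook]
        for v in vectors:
            # Crisp assignment: each vector belongs to exactly one codeword.
            nearest = min(range(len(codebook)),
                          key=lambda k: sq_dist(v, codebook[k]))
            cells[nearest].append(v)
        # Move each codeword to the centroid of its cell; keep it if the cell is empty.
        codebook = [
            [sum(dim) / len(cell) for dim in zip(*cell)] if cell else old
            for cell, old in zip(cells, codebook)
        ]
    return codebook


def distortion(vectors, codebook):
    """Average distance from each vector to its nearest codeword."""
    return sum(min(sq_dist(v, c) for c in codebook) for v in vectors) / len(vectors)


def identify(test_vectors, codebooks):
    """Return the speaker whose codebook quantizes the test data best."""
    return min(codebooks, key=lambda spk: distortion(test_vectors, codebooks[spk]))


def toy_features(center, n, seed):
    """Synthetic 2-D 'features' clustered around a centre (assumed toy data)."""
    rng = random.Random(seed)
    return [[center[0] + rng.gauss(0, 0.3), center[1] + rng.gauss(0, 0.3)]
            for _ in range(n)]


# Deliberately small training sets, to mimic a limited data condition.
train = {
    "spk_a": toy_features((0.0, 0.0), 30, seed=1),
    "spk_b": toy_features((3.0, 3.0), 30, seed=2),
}
codebooks = {spk: train_codebook(vecs, size=4) for spk, vecs in train.items()}
test = toy_features((3.0, 3.0), 10, seed=3)
print(identify(test, codebooks))
```

With so few training vectors per speaker, the codebook size has to stay small, which is one way the limited data condition constrains this kind of model.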

Author information

Corresponding author

Correspondence to H. S. Jayanna.

About this article

Cite this article

Jayanna, H.S., Mahadeva Prasanna, S.R. An experimental comparison of modelling techniques for speaker recognition under limited data condition. Sadhana 34, 717–728 (2009). https://doi.org/10.1007/s12046-009-0042-9

  • DOI: https://doi.org/10.1007/s12046-009-0042-9