Wavelet Packet Based Mel Frequency Cepstral Features for Text Independent Speaker Identification

  • Smriti Srivastava
  • Saurabh Bhardwaj
  • Abhishek Bhandari
  • Krit Gupta
  • Hitesh Bahl
  • J. R. P. Gupta
Part of the Advances in Intelligent Systems and Computing book series (AISC, volume 182)

Abstract

This work proposes a feature extraction scheme that combines the Wavelet Packet Transform (WPT) with the widely used Mel Frequency Cepstral Coefficients (MFCC) for text-independent speaker identification. The scheme overcomes the single-resolution limitation of MFCC by incorporating the multiresolution analysis offered by WPT. To assess its accuracy in a realistic setting, it is tested on a speaker database using a Hidden Markov Model (HMM) and a Gaussian Mixture Model (GMM) as classifiers, and the identification performance of the two classifiers is compared. Identification results obtained with MFCC features and with the Wavelet Packet based Mel Frequency Cepstral (WP-MFC) features are also compared to validate the proposed scheme. With WP-MFC features, identification accuracy reached 100% in some cases.
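The idea of replacing the single-resolution spectrum of MFCC with wavelet packet subbands can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not state the wavelet, the tree structure, or the number of coefficients, so the Haar wavelet, the uniform full tree (a mel-like tree would split low frequencies more finely), and the names `haar_wp_leaves`, `dct2`, and `wp_mfc_features` are all assumptions made for illustration.

```python
import numpy as np


def haar_wp_leaves(frame, level):
    """Full Haar wavelet packet decomposition (assumed wavelet).

    Returns the 2**level leaf subbands in frequency order.
    The frame length must be divisible by 2**level.
    """
    nodes = [np.asarray(frame, dtype=float)]
    for _ in range(level):
        next_nodes = []
        for idx, node in enumerate(nodes):
            even, odd = node[0::2], node[1::2]
            low = (even + odd) / np.sqrt(2.0)   # approximation branch
            high = (even - odd) / np.sqrt(2.0)  # detail branch
            # Children of odd-indexed nodes are swapped so that the
            # leaves come out ordered by increasing frequency.
            next_nodes.extend([low, high] if idx % 2 == 0 else [high, low])
        nodes = next_nodes
    return nodes


def dct2(x):
    """Unnormalized DCT-II, as used in the final step of cepstral analysis."""
    n = len(x)
    k = np.arange(n)
    basis = np.cos(np.pi * k[:, None] * (2 * k[None, :] + 1) / (2 * n))
    return basis @ x


def wp_mfc_features(frame, level=4, n_coeffs=12):
    """Hypothetical WP-MFC-style features: log subband energies, then DCT."""
    leaves = haar_wp_leaves(frame, level)
    energies = np.array([np.sum(band ** 2) for band in leaves])
    log_energies = np.log(energies + 1e-10)  # small offset avoids log(0)
    return dct2(log_energies)[:n_coeffs]


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    frame = rng.standard_normal(256)  # stand-in for one windowed speech frame
    print(wp_mfc_features(frame).shape)
```

In a complete system, one such vector per frame would be passed to the HMM or GMM classifier. The subband energies play the role that mel filter bank energies play in conventional MFCC.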

Keywords

WPT, MFCC, HMM, GMM, Speaker Identification

Copyright information

© Springer-Verlag Berlin Heidelberg 2013

Authors and Affiliations

  • Smriti Srivastava
    • 1
  • Saurabh Bhardwaj
    • 1
  • Abhishek Bhandari
    • 1
  • Krit Gupta
    • 1
  • Hitesh Bahl
    • 1
  • J. R. P. Gupta
    • 1
  1. Netaji Subhas Institute of Technology, New Delhi, India
