Selection of the Best Wavelet Packet Nodes Based on Mutual Information for Speaker Identification

  • Rafael Fernández
  • Ana Montalvo
  • José R. Calvo
  • Gabriel Hernández
Part of the Lecture Notes in Computer Science book series (LNCS, volume 5197)


The analysis of the speech signal using wavelet packet trees (WPT) is a very flexible tool, capable of effectively manipulating the frequency subbands thanks to the orthonormal bases it provides. Dimension reduction becomes very important here, since the number of subbands grows exponentially with the level of decomposition and the subbands differ in discriminative relevance, which calls for a different resolution for each one. A method based on mutual information is proposed in order to keep as much discriminative information as possible and as little redundant information as possible.
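
As a rough illustration of the kind of selection described above, the following Python sketch ranks candidate features (for instance, one value per wavelet packet node and frame) by their mutual information with the speaker label, while penalising mutual information with features already chosen. It is not the authors' algorithm: the histogram estimator, the greedy relevance-minus-redundancy score, the bin count, and the names `mutual_information` and `select_nodes` are all assumptions made for this example.

```python
import numpy as np


def mutual_information(x, y, bins=8):
    """Histogram estimate of I(X;Y) in nats for two 1-D arrays (assumed estimator)."""
    joint, _, _ = np.histogram2d(x, y, bins=bins)
    pxy = joint / joint.sum()                  # joint probabilities
    px = pxy.sum(axis=1, keepdims=True)        # marginal of x, shape (bins, 1)
    py = pxy.sum(axis=0, keepdims=True)        # marginal of y, shape (1, bins)
    nz = pxy > 0                               # skip empty cells to avoid log(0)
    return float(np.sum(pxy[nz] * np.log(pxy[nz] / (px @ py)[nz])))


def select_nodes(features, labels, k, bins=8):
    """Greedily pick k columns of `features` (hypothetical helper).

    Score of a candidate = relevance (MI with the labels)
    minus redundancy (mean MI with the columns already selected).
    """
    n = features.shape[1]
    relevance = [mutual_information(features[:, j], labels, bins) for j in range(n)]
    selected = [int(np.argmax(relevance))]     # start with the most relevant column
    while len(selected) < k:
        best, best_score = None, -np.inf
        for j in range(n):
            if j in selected:
                continue
            redundancy = np.mean(
                [mutual_information(features[:, j], features[:, s], bins) for s in selected]
            )
            score = relevance[j] - redundancy
            if score > best_score:
                best, best_score = j, score
        selected.append(best)
    return selected
```

Here `features` is assumed to be a frames-by-nodes array and `labels` an integer speaker index per frame. A greedy search is used only because scoring every subset of nodes is impractical when the number of subbands grows exponentially with the decomposition level.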


Keywords: WPT, mutual information, feature selection, speaker identification


Copyright information

© Springer-Verlag Berlin Heidelberg 2008

Authors and Affiliations

  • Rafael Fernández (1)
  • Ana Montalvo (1)
  • José R. Calvo (1)
  • Gabriel Hernández (1)
  1. Advanced Technologies Application Center, Havana, Cuba