Feature Selection for Complex Patterns

  • Peter Schenkel
  • Wanqing Li
  • Wanquan Liu
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 4093)


Feature selection is an important data preprocessing step in data mining and pattern recognition. Many algorithms have been proposed in the past for simple patterns that can be characterised by a single feature vector. Unfortunately, these algorithms are hardly applicable to what are referred to as complex patterns, which have to be described by a finite set of feature vectors. This paper addresses the problem of feature selection for complex patterns. First, we formulate the calculation of mutual information for complex patterns based on a Gaussian mixture model. A hybrid feature selection algorithm is then proposed that combines the formulated mutual information calculation (filter) with Bayesian classification (wrapper). Experimental results on the XM2VTS speaker recognition database not only verify the performance of the proposed algorithm, but also demonstrate that traditional feature selection algorithms designed for simple patterns perform poorly on complex patterns.
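To make the filter stage concrete, the sketch below ranks features by an estimate of the mutual information between each feature and the class label. It is a minimal illustration only: it uses a simple histogram estimator for a single feature vector per sample (the "simple pattern" setting), whereas the paper's contribution is a Gaussian-mixture-model-based formulation for sets of feature vectors. All function names here are illustrative, not from the paper.

```python
import numpy as np

def mutual_information(feature, labels, bins=10):
    """Histogram estimate of I(X; Y) between one feature and class labels."""
    # Discretise the continuous feature into equal-width bins (0 .. bins-1).
    edges = np.histogram_bin_edges(feature, bins=bins)
    x = np.digitize(feature, edges[1:-1])
    classes = {c: i for i, c in enumerate(np.unique(labels))}
    # Joint histogram of (binned feature, class), normalised to probabilities.
    joint = np.zeros((bins, len(classes)))
    for xi, yi in zip(x, labels):
        joint[xi, classes[yi]] += 1
    joint /= joint.sum()
    px = joint.sum(axis=1, keepdims=True)   # marginal over bins
    py = joint.sum(axis=0, keepdims=True)   # marginal over classes
    nz = joint > 0                          # skip empty cells (0 log 0 = 0)
    return float((joint[nz] * np.log(joint[nz] / (px @ py)[nz])).sum())

def rank_features(X, y, k):
    """Return indices of the k features with the highest estimated MI."""
    scores = [mutual_information(X[:, j], y) for j in range(X.shape[1])]
    return np.argsort(scores)[::-1][:k]
```

A wrapper stage, as in the hybrid algorithm, would then re-evaluate candidate subsets from this ranking with the actual classifier (here, a Bayesian classifier over GMMs) rather than relying on the filter score alone.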


Keywords: Feature Vector, Feature Selection, Mutual Information, Complex Pattern, Gaussian Mixture Model





Copyright information

© Springer-Verlag Berlin Heidelberg 2006

Authors and Affiliations

  • Peter Schenkel (1)
  • Wanqing Li (2)
  • Wanquan Liu (3)
  1. University of Karlsruhe, Germany
  2. University of Wollongong, Australia
  3. Curtin University of Technology, Australia
