A Gaussian Mixture Based Maximization of Mutual Information for Supervised Feature Extraction
In this paper, we propose a new method for linear feature extraction and dimensionality reduction in classification problems. The method maximizes the Mutual Information (MI) between the extracted features and the class labels. A Gaussian Mixture Model is used to model the distribution of the data; from this model, the entropy of the data, and hence the MI at the output, is estimated. A gradient-based algorithm is provided for the optimization. Experiments are reported in which the method is compared with other popular linear feature extractors.
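The abstract's pipeline (model the projected data with per-class Gaussian mixtures, estimate entropies from the model, and obtain MI as H(Z) − Σ_c p(c) H(Z|c)) can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function name `mi_estimate`, the use of scikit-learn's `GaussianMixture`, and the Monte-Carlo entropy estimates are all assumptions introduced here for clarity.

```python
import numpy as np
from sklearn.mixture import GaussianMixture

def mi_estimate(X, y, W, n_components=2, n_samples=2000, seed=0):
    """Monte-Carlo estimate of I(Z; C) for the projection Z = X W.

    Hypothetical sketch of the paper's idea: fit one GMM per class in the
    projected space, then estimate I(Z;C) = H(Z) - sum_c p(c) H(Z|c),
    with each entropy approximated as -E[log p(z)] over samples drawn
    from the fitted mixtures.
    """
    rng = np.random.RandomState(seed)
    Z = X @ W
    classes, counts = np.unique(y, return_counts=True)
    priors = counts / len(y)

    # One Gaussian mixture per class models p(z | c).
    gmms = []
    for c in classes:
        g = GaussianMixture(n_components=n_components, random_state=seed)
        g.fit(Z[y == c])
        gmms.append(g)

    # Marginal density p(z) = sum_c p(c) p(z | c).
    def log_marginal(z):
        per_class = np.stack([g.score_samples(z) for g in gmms], axis=1)
        return np.log(np.exp(per_class) @ priors + 1e-300)

    # Conditional entropy H(Z|C): sample from each class model.
    h_cond = 0.0
    for p_c, g in zip(priors, gmms):
        samples, _ = g.sample(n_samples)
        h_cond += p_c * (-g.score_samples(samples).mean())

    # Marginal entropy H(Z): sample from the class mixture of mixtures.
    counts_per_class = rng.multinomial(n_samples, priors)
    samples_all = np.vstack([g.sample(max(n, 1))[0]
                             for g, n in zip(gmms, counts_per_class)])
    h_marg = -log_marginal(samples_all).mean()
    return h_marg - h_cond
```

In a full method along the lines described, this estimate would be maximized over the projection matrix W by a gradient procedure; here the function only scores a fixed W, which already suffices to compare informative and uninformative projections.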
Keywords: Feature Extraction · Mutual Information · Partial Least Squares · Independent Component Analysis · Gaussian Mixture Model