Unsupervised Extraction and Supervised Selection of Features Based on Information Gain
For robust recognition we first extract features from sensory data without considering the class labels, and then select the important features for classification. The unsupervised feature extraction may incorporate Principal Component Analysis (PCA), Independent Component Analysis (ICA), and Non-negative Matrix Factorization (NMF). For the supervised selection of features we adopt the Fisher Score and Information Gain (IG). To avoid calculating multivariate joint probability density functions, we use the Mutual Information (MI) between each feature and the class variable instead of the IG. In this case, however, MI among the selected features reduces the effectiveness of the feature selection, and the statistically independent ICA-based features yield the best performance.
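The selection step described above can be sketched as follows. This is a minimal illustration, not the paper's implementation: it assumes the extracted features have already been discretized, and the function names (`mutual_information`, `select_top_k`) are ours. Each feature column is scored by its MI with the class variable, and the top-ranked columns are kept.

```python
import numpy as np

def mutual_information(x, y):
    """Estimate I(X; Y) in bits for two discrete 1-D arrays."""
    x, y = np.asarray(x), np.asarray(y)
    mi = 0.0
    for xv in np.unique(x):
        for yv in np.unique(y):
            pxy = np.mean((x == xv) & (y == yv))   # joint probability
            px = np.mean(x == xv)                   # marginal of the feature
            py = np.mean(y == yv)                   # marginal of the class
            if pxy > 0:
                mi += pxy * np.log2(pxy / (px * py))
    return mi

def select_top_k(features, labels, k):
    """Rank discretized feature columns by MI with the class variable
    and return the indices of the k highest-scoring columns."""
    scores = np.array([mutual_information(features[:, j], labels)
                       for j in range(features.shape[1])])
    return np.argsort(scores)[::-1][:k], scores
```

Note that this ranks each feature independently; as the abstract points out, MI among the selected features themselves is not accounted for, which is why statistically independent (ICA-based) features work best with this criterion.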
Keywords: Feature extraction, feature selection, Fisher score, information gain, mutual information, independent component analysis