Conditional Infomax Learning: An Integrated Framework for Feature Extraction and Fusion
The paper introduces a new framework for feature learning in classification motivated by information theory. We first systematically study the information structure and present a novel perspective revealing two key factors in information utilization: class relevance and redundancy. We derive a new information decomposition model in which a novel concept, class-relevant redundancy, is introduced. Subsequently, a new algorithm, Conditional Informative Feature Extraction, is formulated; it maximizes the joint class-relevant information by explicitly reducing the class-relevant redundancies among features. To address the computational difficulties of information-based optimization, we incorporate Parzen window estimation into a discrete approximation of the objective function and propose a Local Active Region method that substantially increases optimization efficiency. To effectively utilize the extracted feature set, we propose a Bayesian MAP formulation for feature fusion, which unifies a Laplacian sparse prior and multivariate logistic regression to learn a fusion rule with good generalization capability. Recognizing the inefficiency caused by treating the extraction stage and the fusion stage separately, we further develop an improved design of the framework that coordinates the two stages by introducing feedback from the fusion stage to the extraction stage, significantly enhancing learning efficiency. Comparative experiments show remarkable improvements achieved by our framework.
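To make the information-theoretic objective concrete, here is a minimal sketch of how Parzen window (Gaussian kernel) density estimates can turn the class-relevant information of a single feature, I(f; C) = H(f) − H(f | C), into a computable quantity on a discrete grid. This is an illustration of the general technique only; the function names, bandwidth, and grid discretization below are our assumptions, not the paper's actual formulation (which handles joint feature sets and the Local Active Region speedup).

```python
import numpy as np

def parzen_density(x, samples, h):
    """Parzen window density estimate at points x, using a Gaussian
    kernel of bandwidth h centered on each sample."""
    diffs = x[:, None] - samples[None, :]
    k = np.exp(-0.5 * (diffs / h) ** 2) / (h * np.sqrt(2 * np.pi))
    return k.mean(axis=1)

def mutual_information(f, labels, h=0.3, grid_size=200):
    """Approximate I(f; C) = H(f) - sum_c P(c) H(f | c) for a scalar
    feature f, with entropies computed on a discrete grid (hypothetical
    discretization; the paper's approximation scheme may differ)."""
    grid = np.linspace(f.min() - 3 * h, f.max() + 3 * h, grid_size)
    dx = grid[1] - grid[0]
    # marginal entropy H(f)
    p = parzen_density(grid, f, h)
    H = -np.sum(p * np.log(p + 1e-12)) * dx
    # conditional entropy H(f | C), averaged over class priors
    H_cond = 0.0
    for c in np.unique(labels):
        fc = f[labels == c]
        p_c = parzen_density(grid, fc, h)
        H_cond += (len(fc) / len(f)) * (-np.sum(p_c * np.log(p_c + 1e-12)) * dx)
    return H - H_cond
```

A feature whose class-conditional distributions are well separated scores a mutual information near the class entropy (about log 2 nats for two balanced classes), while a feature independent of the label scores near zero; maximizing such a score over candidate features is the starting point that the conditional-infomax objective extends with redundancy terms.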
Keywords: Feature Selection, Feature Extraction, Mutual Information, Face Recognition, Extraction Stage
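For the fusion stage, a MAP estimate of the fusion weights under a Laplacian (sparse) prior combined with a logistic likelihood is equivalent to L1-regularized logistic regression. The sketch below solves the binary case with proximal gradient descent (ISTA); the solver choice, parameter names, and two-class restriction are our assumptions for illustration, not the paper's actual multivariate formulation.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def map_fusion_weights(X, y, lam=0.05, lr=0.5, n_iter=1000):
    """MAP fusion weights: logistic likelihood + Laplacian prior,
    i.e. L1-regularized logistic regression, solved by proximal
    gradient (ISTA). X: (n, d) feature scores, y: (n,) labels in {0, 1}."""
    n, d = X.shape
    w = np.zeros(d)
    for _ in range(n_iter):
        # gradient of the mean negative log-likelihood
        grad = X.T @ (sigmoid(X @ w) - y) / n
        w = w - lr * grad
        # soft-thresholding: the proximal step induced by the Laplacian prior
        w = np.sign(w) * np.maximum(np.abs(w) - lr * lam, 0.0)
    return w
```

The soft-threshold step drives the weights of uninformative features exactly to zero, which is the practical payoff of the sparse prior: the learned fusion rule concentrates on the few extracted features that actually carry class-relevant information.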