Extracting Continuous Relevant Features
The problem of extracting the relevant aspects of data, in the face of multiple conflicting structures, is inherent to the modeling of complex data. Extracting continuous structures in one random variable that are relevant for another variable has recently been addressed mainly via the method of sufficient dimensionality reduction. However, such auxiliary variables often contain both structures that are relevant and others that are irrelevant to the task at hand. In the context of clustering, identifying the relevant structures was shown to be considerably improved by minimizing the information about another, irrelevant, variable. In this paper we address the problem of extracting continuous relevant structures and derive its formal, as well as algorithmic, solution. Its operation is demonstrated on a synthetic example and in a real-world application to face images, showing its superiority over current methods such as oriented principal component analysis.
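The oriented principal component analysis baseline mentioned above can be illustrated with a minimal sketch: a projection direction is chosen to maximize variance attributed to the relevant structure relative to variance attributed to an irrelevant (side) variable, which reduces to a generalized eigenvalue problem. The synthetic data and axis layout below are illustrative assumptions, not the paper's actual experiment.

```python
import numpy as np
from scipy.linalg import eigh

rng = np.random.default_rng(0)

# Hypothetical synthetic data: "relevant" variation lies mostly along axis 0,
# "irrelevant" (side-information) variation mostly along axis 1.
relevant = rng.normal(size=(500, 3)) * np.array([3.0, 1.0, 1.0])
irrelevant = rng.normal(size=(500, 3)) * np.array([1.0, 3.0, 1.0])

S_rel = np.cov(relevant, rowvar=False)    # covariance of the relevant structure
S_irr = np.cov(irrelevant, rowvar=False)  # covariance of the irrelevant variable

# Oriented PCA: maximize (w' S_rel w) / (w' S_irr w), solved as the
# generalized eigenproblem S_rel w = lambda * S_irr w.
eigvals, eigvecs = eigh(S_rel, S_irr)
w = eigvecs[:, -1]            # eigenvector with the largest generalized eigenvalue
w /= np.linalg.norm(w)        # unit-norm projection direction
```

With this setup the recovered direction aligns with axis 0, i.e. with the relevant rather than the irrelevant variation; a purely variance-based PCA on the pooled data would not make that distinction.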
Keywords: Face Recognition, Face Image, Side Information, Neural Information Processing System, Continuous Structure