Abstract
We address the problem of selecting a subset of the most relevant features from a set of sample data in cases where there are multiple (equally reasonable) solutions. In particular, this covers, on the one hand, the introduction of hand-crafted kernels that emphasize certain desirable aspects of the data and, on the other hand, the suppression of one of the solutions given “side” data, i.e., information about undesired aspects of the data. Such situations often arise when there are several, even conflicting, dimensions to the data. For example, documents can be clustered based on topic, authorship or writing style; images of human faces can be clustered based on illumination conditions, facial expressions or person identity; and so forth.
Starting from a spectral method for feature selection, known as Q − α, we first introduce a kernel version of the approach, thereby adding the power of non-linearity to the underlying representations and the choice to emphasize certain kernel-dependent aspects of the data. As an alternative to the use of a kernel, we introduce a principled manner of exploiting auxiliary data within a spectral approach for handling situations where multiple subsets of relevant features exist in the data. The algorithm we introduce allows relevant features of the auxiliary dataset to be inhibited and a topological model of all relevant feature subsets in the dataset to be constructed.
To evaluate the effectiveness of our approach we have conducted experiments both on real images of human faces under varying illumination, facial expressions and person identity, and on general machine learning tasks taken from the UC Irvine repository. The performance of our algorithm for selecting features with side information is generally superior to the current methods we tested (PCA, OPCA, CPCA and SDR-SI).
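To make the spectral feature-selection idea behind Q − α concrete, the following is a minimal illustrative sketch (not the authors' exact algorithm; the function name, the fixed iteration count, and the alternating update scheme are assumptions for illustration). It alternates between a spectral step, which extracts the top eigenvectors of a feature-weighted sample affinity matrix, and a weight step, which re-estimates one relevance weight per feature as the leading eigenvector of an induced quadratic form:

```python
import numpy as np

def q_alpha_sketch(X, k=2, n_iter=20, seed=0):
    """Toy spectral feature weighting in the spirit of Q-alpha
    (a simplified sketch, not the authors' exact formulation).

    X : (n_features, n_samples) array; row i holds feature i across samples.
    Returns one weight per feature; large |weight| suggests a relevant feature.
    """
    rng = np.random.default_rng(seed)
    alpha = rng.standard_normal(X.shape[0])
    alpha /= np.linalg.norm(alpha)
    for _ in range(n_iter):
        # Weighted sample-affinity matrix: A = sum_i alpha_i * x_i x_i^T
        A = (X * alpha[:, None]).T @ X
        A = 0.5 * (A + A.T)
        # Spectral step: Q spans the top-k eigenvectors of A^2
        _, V = np.linalg.eigh(A @ A)
        Q = V[:, -k:]
        # Weight step: alpha maximizes alpha^T G alpha over unit vectors,
        # with G_ij = (x_i^T x_j) * (x_i^T Q Q^T x_j) -> top eigenvector of G
        P = X @ Q
        G = (X @ X.T) * (P @ P.T)
        _, VG = np.linalg.eigh(G)
        alpha = VG[:, -1]
    return alpha
```

On synthetic data where a few features carry a strong two-cluster structure and the rest are noise, the returned weights concentrate (in absolute value) on the structured features, which is the qualitative behavior the paper exploits.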
References
Chechik, G., Tishby, N.: Extracting relevant structures with side information. In: NIPS (2002)
Chung, F.R.K.: Spectral Graph Theory. AMS, Providence (1998)
Diamantaras, K.I., Kung, S.Y.: Principal Component Neural Networks: Theory and Applications. Wiley, NY (1996)
Jacobs, C.E., Finkelstein, A., Salesin, D.H.: Fast multiresolution image querying. In: SIGGRAPH (1995)
Globerson, A., Chechik, G., Tishby, N.: Sufficient dimensionality reduction with irrelevance statistics. In: UAI (2003)
Golub, G.H., Van Loan, C.F.: Matrix Computations. Johns Hopkins University Press, Baltimore (1989)
Martinez, A.M., Benavente, R.: The AR face database. Tech. Rep. 24, CVC (1998)
Motzkin, T.S., Straus, E.G.: Maxima for graphs and a new proof of a theorem of Turán. Canadian Journal of Mathematics 17, 533–540 (1965)
Oren, M., Papageorgiou, C., Sinha, P., Osuna, E., Poggio, T.: Pedestrian detection using wavelet templates. In: CVPR (1997)
Pavan, M., Pelillo, M.: A new graph-theoretic approach to clustering and segmentation. In: CVPR (2003)
Scholkopf, B., Smola, A.J.: Learning with Kernels. The MIT press, Cambridge (2002)
Tenenbaum, J.B., de Silva, V., Langford, J.C.: A global geometric framework for nonlinear dimensionality reduction. Science 290, 2319–2323 (2000)
Vapnik, V.N.: The Nature of Statistical Learning Theory, 2nd edn. Springer, Heidelberg (1998)
Wagstaff, K., Cardie, C., Rogers, S., Schroedl, S.: Constrained K-means clustering with background knowledge. In: ICML (2001)
Weinshall, D., Shental, N., Hertz, T., Pavel, M.: Adjustment learning and relevant component analysis. In: ECCV (2002)
Ng, A.Y., Jordan, M.I., Weiss, Y.: On spectral clustering: analysis and an algorithm. In: NIPS (2001)
Shashua, A., Wolf, L.: Sparse spectral-based feature selection with side information. TR 2003-57, Leibniz Center for Research, HUJI (2003)
Wolf, L., Shashua, A.: Kernel principal angles for classification machines with applications to image sequence interpretation. In: CVPR (2003)
Wolf, L., Shashua, A.: Direct feature selection with implicit inference. In: ICCV (2003)
Xing, E.P., Ng, A.Y., Jordan, M.I., Russell, S.: Distance metric learning, with applications to clustering with side information. In: NIPS (2002)
Copyright information
© 2004 Springer-Verlag Berlin Heidelberg
Cite this paper
Shashua, A., Wolf, L. (2004). Kernel Feature Selection with Side Data Using a Spectral Approach. In: Pajdla, T., Matas, J. (eds) Computer Vision - ECCV 2004. ECCV 2004. Lecture Notes in Computer Science, vol 3023. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-24672-5_4
Print ISBN: 978-3-540-21982-8
Online ISBN: 978-3-540-24672-5