Abstract
Spectral clustering is one of the most popular methods for data clustering, and its performance depends on the quality of the eigenvectors of the associated graph Laplacian. The graph Laplacian is generally constructed from the full feature set, which degrades the quality of those eigenvectors when a dataset contains many noisy or irrelevant features. To address this problem, we propose a novel unsupervised feature selection method inspired by perturbation analysis theory, which relates the perturbation of a matrix's eigenvectors to the perturbation of its elements. We evaluate the importance of each feature by the average L1 norm of the perturbation, with respect to a perturbation of that feature, of the k eigenvectors of the graph Laplacian corresponding to its k smallest positive eigenvalues. Extensive experiments on several high-dimensional multi-class datasets show that our method performs well compared with several state-of-the-art unsupervised feature selection methods.
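The scoring idea in the abstract can be illustrated with a rough numerical sketch. It is not the authors' algorithm: the abstract does not specify how the Laplacian is built or how the eigenvector perturbation is computed, so this sketch assumes a Gaussian-affinity unnormalized Laplacian and substitutes a plain finite-difference perturbation for any analytic perturbation formula. The function names and the parameters `sigma`, `eps`, and `tol` are hypothetical.

```python
import numpy as np


def laplacian_eigvecs(X, k, sigma=1.0, tol=1e-8):
    """Eigenvectors of L = D - W for the k smallest positive eigenvalues.

    Assumption: W is a dense Gaussian affinity matrix with bandwidth sigma.
    """
    # Pairwise squared Euclidean distances between samples (rows of X).
    sq = np.sum(X ** 2, axis=1)
    d2 = np.maximum(sq[:, None] + sq[None, :] - 2.0 * (X @ X.T), 0.0)
    W = np.exp(-d2 / (2.0 * sigma ** 2))
    np.fill_diagonal(W, 0.0)
    L = np.diag(W.sum(axis=1)) - W
    vals, vecs = np.linalg.eigh(L)  # eigenvalues in ascending order
    positive = vals > tol           # skip the (numerically) zero eigenvalues
    return vecs[:, positive][:, :k]


def eigenvector_sensitivity_scores(X, k, eps=1e-3, sigma=1.0):
    """Average L1 change of the k eigenvectors per unit relative
    perturbation of each feature (finite-difference approximation)."""
    base = laplacian_eigvecs(X, k, sigma)
    scores = np.zeros(X.shape[1])
    for j in range(X.shape[1]):
        Xp = X.copy()
        Xp[:, j] *= 1.0 + eps  # small relative perturbation of feature j
        pert = laplacian_eigvecs(Xp, k, sigma)
        # Eigenvectors are defined only up to sign; align with the baseline.
        signs = np.sign(np.sum(base * pert, axis=0))
        signs[signs == 0] = 1.0
        diff = pert * signs - base
        # L1 norm per eigenvector, averaged over the k eigenvectors.
        scores[j] = np.mean(np.sum(np.abs(diff), axis=0)) / eps
    return scores
```

Calling `eigenvector_sensitivity_scores(X, k)` on an `n × d` data matrix returns one non-negative score per feature. A finite-difference estimate like this can be unstable when neighboring eigenvalues are nearly equal, and how the scores are turned into a feature ranking is left to the paper itself.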
Copyright information
© 2011 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Jiang, Y., Ren, J. (2011). Eigenvector Sensitive Feature Selection for Spectral Clustering. In: Gunopulos, D., Hofmann, T., Malerba, D., Vazirgiannis, M. (eds) Machine Learning and Knowledge Discovery in Databases. ECML PKDD 2011. Lecture Notes in Computer Science, vol 6912. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-23783-6_8
DOI: https://doi.org/10.1007/978-3-642-23783-6_8
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-23782-9
Online ISBN: 978-3-642-23783-6
eBook Packages: Computer Science (R0)