Joint Spectral Clustering based on Optimal Graph and Feature Selection

Neural Processing Letters

Abstract

Redundant features and outliers (noise) in the data points heavily influence a clustering model's ability to discover more distinguishing features. To solve this issue, we propose a new spectral clustering method that performs feature selection with \(L_{2,1}\)-norm regularization while simultaneously learning orthogonal representations for each sample to preserve the local structures of data points. Our model also solves the out-of-sample problem, in which the training process does not output an explicit model for predicting unseen data points, and we provide an efficient optimization method for the proposed objective function. Experimental results on twelve data sets show that our method achieves the best performance compared with other similar models.
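As a rough illustration of why \(L_{2,1}\)-norm regularization acts as feature selection, the sketch below computes the norm in its standard form (the sum of the Euclidean norms of the rows of a matrix) and ranks features by row norm. The matrix `W`, its values, and the helper names `l21_norm` and `rank_features` are assumptions made for this example; they are not taken from the paper, and the sketch does not reproduce the proposed objective function or its optimization.

```python
import numpy as np


def l21_norm(W):
    """L2,1-norm: sum over rows of each row's Euclidean (L2) norm."""
    return float(np.sum(np.linalg.norm(W, axis=1)))


def rank_features(W):
    """Return feature (row) indices sorted by decreasing row norm.

    Rows driven to (near) zero by the regularizer correspond to
    features that contribute little and can be discarded.
    """
    row_norms = np.linalg.norm(W, axis=1)
    return np.argsort(-row_norms)


# Hypothetical weight matrix: 3 features (rows) x 2 output dimensions.
W = np.array([[3.0, 4.0],
              [0.0, 0.0],
              [1.0, 0.0]])

print(l21_norm(W))       # row norms are 5, 0 and 1
print(rank_features(W))  # the all-zero row is ranked last
```

Because the penalty sums row norms rather than squaring them, minimizing it tends to zero out entire rows at once, which is what allows whole features to be dropped.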


Notes

  1. http://www.escience.cn/people/chenxiaojun/index.html.


Author information

Corresponding author

Correspondence to Jinting Zhu.

About this article

Cite this article

Zhu, J., Jang-Jaccard, J., Liu, T. et al. Joint Spectral Clustering based on Optimal Graph and Feature Selection. Neural Process Lett 53, 257–273 (2021). https://doi.org/10.1007/s11063-020-10383-9

  • DOI: https://doi.org/10.1007/s11063-020-10383-9
