Abstract
Graph-based spectral clustering techniques have developed rapidly in applications such as image processing, network analysis, and pattern recognition. The spectral graph technique transforms the original data into a graph partitioning problem, whose relaxed form can then be solved by fast numerical methods. This paper proposes a novel graph-based spectral clustering model by modifying the relaxed ratio Cheeger cut (RRCC) model. Specifically, to enhance the robustness of the clustering, we replace the \(\ell ^1\)-norm in the denominator of the RRCC model by the \(\ell ^2\)-norm. Since the proposed model is a fractional optimization problem, we transform it into a difference-of-convex problem. With this transformation, the alternating direction method of multipliers (ADMM) can be employed to solve it. Experimental comparisons on several benchmark databases demonstrate the superior clustering capabilities of our proposed model and algorithm.
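The kind of ratio objective described above can be illustrated on a toy graph. The following is a minimal sketch, not the paper's actual algorithm: it assumes the numerator is the graph cut value (the total variation of a binary indicator vector) and, following the abstract's modification, uses the \(\ell ^2\)-norm of the indicator as the denominator; the function name, the toy graph, and the exact balance term are our own illustrative choices.

```python
import numpy as np

def ratio_cheeger_objective(W, u):
    """Illustrative cut ratio for a binary indicator u on a weighted graph W.

    Numerator: graph total variation 0.5 * sum_{i,j} w_ij |u_i - u_j|,
    i.e. the total weight of edges cut by the partition {u=1} / {u=0}.
    Denominator: the l2-norm ||u||_2 (a stand-in for the paper's modified
    balance term; the precise form used there may differ).
    """
    W = np.asarray(W, dtype=float)
    u = np.asarray(u, dtype=float)
    cut = 0.5 * np.sum(W * np.abs(u[:, None] - u[None, :]))
    return cut / np.linalg.norm(u, 2)

# Toy graph: two triangles of unit-weight edges joined by one weak bridge.
W = np.zeros((6, 6))
for i, j in [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5)]:
    W[i, j] = W[j, i] = 1.0
W[2, 3] = W[3, 2] = 0.1  # the bridge

u_good = np.array([1, 1, 1, 0, 0, 0])  # cuts only the bridge
u_bad = np.array([1, 1, 0, 0, 0, 0])   # cuts two unit edges inside a triangle
print(ratio_cheeger_objective(W, u_good))
print(ratio_cheeger_objective(W, u_bad))
```

Minimizing such a ratio favors cutting the weak bridge (small numerator) while the denominator rewards balanced partitions, which is the intuition behind Cheeger-type cuts.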
Notes
Here we mainly use the fact that \(\mathbb {M}(\mathbf{u} )=0\) is equivalent to \(\displaystyle \sum _{i=1}^Nu_i\le \frac{|\mathbb {V}|}{2}\). In addition, we can easily deduce that \(\Vert \mathbf{u} \Vert _2\ne 0\) since \(\mathbf{u} \) is an indicator function of the clustering results.
For the spectral clustering methods of the RNC in Shi and Malik (2000) and the IPRCC in Hein and Buehler (2010), the code is available at https://www.ml.uni-saarland.de/code.htm. For the SBRCC used in Bresson et al. (2014), the code can be obtained from https://github.com/xbresson.
References
Barrett R, Berry M, Chan T et al (1994) Templates for the solution of linear systems: building blocks for iterative methods. SIAM, pp 12–14
Blake C, Merz C (2021) UCI repository of machine learning databases. University of California. http://archive.ics.uci.edu/ml/datasets.php. Accessed 2021
Bresson X, Tai X, Chan T, Szlam A (2014) Multi-class transductive learning based on \(\ell ^1\) relaxations of Cheeger cut and Mumford–Shah–Potts model. J Math Imaging Vis 49(1):191–201
Buehler T, Hein M (2009) Spectral clustering based on the graph p-Laplacian. In: Proceedings of the 26th international conference on machine learning, pp 81–88
Cai D (2021) Codes and datasets for feature learning-popular data sets. http://www.cad.zju.edu.cn/home/dengcai/Data/data.html. Accessed 2021
Chang K, Shao S, Zhang D (2017) Cheeger’s cut, maxcut and the spectral theory of 1-Laplacian on graphs. Sci China Math 60(11):1963–1980
Cheeger J (1969) A lower bound for the smallest eigenvalue of the Laplacian. In: Proceedings of the Princeton conference in honor of Professor S. Bochner, pp 195–199
Chung F (1997) Spectral graph theory. American Mathematical Society
Courant R, Friedrichs K, Lewy H (1967) On the partial difference equations of mathematical physics. IBM J Res Dev 11(2):215–234
Crouzeix J, Ferland J (1991) Algorithms for generalized fractional programming. Math Program 52(1–3):191–207
Dinkelbach W (1967) On nonlinear fractional programming. Manag Sci 13(7):492–498
Donath W, Hoffman A (1973) Lower bounds for the partitioning of graphs. IBM J Res Dev 17(5):420–425
Feld T, Aujol J, Gilboa G, Papadakis N (2019) Rayleigh quotient minimization for absolutely one-homogeneous functionals. Inverse Probl 35(6):064003
Glowinski R, Osher S, Yin W (2017) Splitting methods in communication, imaging, science, and engineering. Springer, Berlin
Gotoh J, Takeda A, Tono K (2018) DC formulations and algorithms for sparse optimization problems. Math Program B 169(1):141–176
Hagen L, Kahng A (1991) Fast spectral methods for ratio cut partitioning and clustering. In: IEEE international conference on computer-aided design, pp 10–13
Hagen L, Kahng A (1992) New spectral methods for ratio cut partitioning and clustering. IEEE Trans Comput Aided Des Integr Circuits Syst 11(9):1074–1085
Halkidi M, Gunopulos D, Vazirgiannis M, Kumar N, Domeniconi C (2007) A clustering framework based on subjective and objective validity criteria. J ACM Trans Knowl Discov Data 1(4):119–143
Hartigan J, Wong M (1979) A K-means clustering algorithm. J R Stat Soc 28(1):100–108
Hein M, Buehler T (2010) An inverse power method for nonlinear eigenproblems with applications in 1-spectral clustering and sparse PCA. In: Advances in neural information processing systems, pp 847–855
Hein M, Setzer S (2011) Beyond spectral clustering-tight relaxations of balanced graph cuts. Adv Neural Inf Process Syst 24:2366–2374
Huang D, Wang C, Wu J, Lai J, Kwoh C (2020) Ultra-scalable spectral clustering and ensemble clustering. IEEE Trans Knowl Data Eng 32(6):1212–1226
Khamaru K, Wainwright M (2019) Convergence guarantees for a class of non-convex and non-smooth optimization problems. J Mach Learn Res 20(154):1–52
Kolev K, Cremers D (2009) Continuous ratio optimization via convex relaxation with applications to multiview 3D reconstruction. In: IEEE conference on computer vision and pattern recognition (CVPR), pp 1858–1864
Lellmann J, Strekalovskiy E, Koetter S, Cremers D (2013) Total variation regularization for functions with values in a manifold. In: IEEE international conference on computer vision, pp 2944–2951
Leng C, Zhang H, Cai G, Cheng I, Basu A (2019) Graph regularized \(L_p\) smooth non-negative matrix factorization for data representation. IEEE/CAA J Autom Sin 6(2):584–595
von Luxburg U (2007) A tutorial on spectral clustering. Stat Comput 17(4):395–416
Manor L, Perona P (2004) Self-tuning spectral clustering. Adv Neural Inf Process Syst 2:1601–1608
Merkurjev E, Bertozzi A, Yan X, Lerman K (2017) Modified Cheeger and ratio cut methods using the Ginzburg–Landau functional for classification of high-dimensional data. Inverse Probl 33(7):074003
Nasraoui O, Benncir C (2019) Clustering methods for big data analytics: techniques, toolboxes and applications. Springer, Berlin
Rahimi Y, Wang C, Dong H, Lou Y (2019) A scale-invariant approach for sparse signal recovery. SIAM J Sci Comput 41(6):A3649–A3672
Shi J, Malik J (2000) Normalized cuts and image segmentation. IEEE Trans Pattern Anal Mach Intell 22(8):888–905
Sun J, Qu Q, Wright J (2017) Complete dictionary recovery over the sphere I: overview and the geometric picture. IEEE Trans Inf Theory 63(2):853–884
Szlam A, Bresson X (2010) Total variation and Cheeger cuts. In: Proceedings of the 27th international conference on machine learning, pp 1039–1046
Thi H, Dinh T (2018) DC programming and DCA: thirty years of developments. Math Program B 169:5–68
Wierzchon S, Klopotek M (2018) Modern algorithms of cluster analysis. Springer, Berlin
Yuille A, Rangarajan A (2003) The concave-convex procedure. Neural Comput 15(4):915–936
Zhang YZ, Jiang Y, Pang ZF (2013) Cheeger cut model for the balanced data classification problem. Adv Mater Res 765–767:730–734
Acknowledgements
The authors would like to thank Dr. Zhi-Feng Pang for the valuable discussions and constructive suggestions on an earlier version of this paper. This work was partially supported by the Natural Science Foundation of HuNan Province (No. 2019JJ40323).
Additional information
Communicated by Leonardo de Lima.
Cite this article
Yang, YF., Zhou, H. & Zhou, B. Data clustering based on the modified relaxation Cheeger cut model. Comp. Appl. Math. 41, 61 (2022). https://doi.org/10.1007/s40314-022-01757-x
Keywords
- Data clustering
- Ratio normalized cut
- Ratio Cheeger cut
- Rayleigh quotient
- Alternating direction method of multipliers