
Data clustering based on the modified relaxation Cheeger cut model

Published in Computational and Applied Mathematics

Abstract

Graph-based spectral clustering techniques have developed rapidly in applications such as image processing, network analysis, and pattern recognition. Spectral graph techniques transform the original data into a graph partitioning problem, whose relaxed form can then be solved by fast numerical methods. This paper proposes a novel graph-based spectral clustering model obtained by modifying the relaxation ratio Cheeger cut (RRCC) model. Specifically, to enhance the robustness of the clustering, we propose to replace the \(\ell ^1\)-norm in the denominator of the RRCC model by the \(\ell ^2\)-norm. Since the proposed model is a fractional optimization problem, we transform it into a difference-of-convex problem. With this transformation, the alternating direction method of multipliers can be employed to solve it. Experimental comparisons on several benchmark databases demonstrate the superior clustering capabilities of the proposed model and algorithm.
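To make the modification concrete, the following Python sketch evaluates both ratio objectives on a toy graph. The numerator is the graph total variation in both cases; only the denominator changes from the \(\ell ^1\)-norm to the \(\ell ^2\)-norm. The median-centered balance term and all function names here are illustrative assumptions, not the paper's exact formulation, which may center or constrain the indicator differently.

```python
import numpy as np

def total_variation(W, u):
    """Graph total variation: 0.5 * sum_ij w_ij |u_i - u_j| (the shared numerator)."""
    return 0.5 * np.sum(W * np.abs(u[:, None] - u[None, :]))

def rrcc_objective(W, u):
    """RRCC-style ratio: l1-norm balance term in the denominator (assumed form)."""
    return total_variation(W, u) / np.sum(np.abs(u - np.median(u)))

def modified_objective(W, u):
    """Proposed modification: l2-norm in the denominator for robustness."""
    return total_variation(W, u) / np.linalg.norm(u - np.median(u), 2)

# Two triangles joined by a single edge: cutting that edge gives a small ratio.
W = np.zeros((6, 6))
for i, j in [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]:
    W[i, j] = W[j, i] = 1.0
u = np.array([1.0, 1.0, 1.0, 0.0, 0.0, 0.0])  # indicator of the first triangle
print(rrcc_objective(W, u), modified_objective(W, u))
```

Only the bridging edge \((2,3)\) contributes to the numerator, so both objectives reward the natural two-triangle partition; they differ only in how the balance of the two clusters is measured.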


Notes

  1. Here we mainly use the fact that \(\mathbb {M}(\mathbf{u} )=0\) is equivalent to \(\displaystyle \sum _{i=1}^Nu_i\le \frac{|\mathbb {V}|}{2}\). In addition, we can easily deduce that \(\Vert \mathbf{u} \Vert _2\ne 0\) since \(\mathbf{u} \) is an indicator function of the clustering results.

  2. For the spectral clustering methods RNC (Shi and Malik 2000) and IPRCC (Hein and Buehler 2010), the code is available at https://www.ml.uni-saarland.de/code.htm. For the SBRCC used in Bresson et al. (2014), the code can be obtained from https://github.com/xbresson.

References

  • Barrett R, Berry M, Chan T et al (1994) Templates for the solution of linear systems: building blocks for iterative methods. SIAM, pp 12–14

  • Blake C, Merz C (2021) UCI repository of machine learning databases. University of California. http://archive.ics.uci.edu/ml/datasets.php. Accessed 2021

  • Bresson X, Tai X, Chan T, Szlam A (2014) Multi-class transductive learning based on \(\ell ^1\) relaxations of Cheeger cut and Mumford–Shah–Potts model. J Math Imaging Vis 49(1):191–201


  • Buehler T, Hein M (2009) Spectral clustering based on the graph p-Laplacian. In: Proceedings of the 26th international conference on machine learning, pp 81–88

  • Cai D (2021) Codes and datasets for feature learning-popular data sets. http://www.cad.zju.edu.cn/home/dengcai/Data/data.html. Accessed 2021

  • Chang K, Shao S, Zhang D (2017) Cheeger’s cut, maxcut and the spectral theory of 1-Laplacian on graphs. Sci China Math 60(11):1963–1980

  • Cheeger J (1969) A lower bound for the smallest eigenvalue of the Laplacian. In: Proceedings of the Princeton conference in honor of Professor S. Bochner, pp 195–199

  • Chung F (1997) Spectral graph theory. American Mathematical Society

  • Courant R, Friedrichs K, Lewy H (1967) On the partial difference equations of mathematical physics. IBM J Res Dev 11(2):215–234

  • Crouzeix J, Ferland J (1991) Algorithms for generalized fractional programming. Math Program 52(1–3):191–207

  • Dinkelbach W (1967) On nonlinear fractional programming. Manag Sci 13(7):492–498

  • Donath W, Hoffman A (1973) Lower bounds for the partitioning of graphs. IBM J Res Dev 17(5):420–425

  • Feld T, Aujol J, Gilboa G, Papadakis N (2019) Rayleigh quotient minimization for absolutely one-homogeneous functionals. Inverse Probl 35(6):064003

  • Glowinski R, Osher S, Yin W (2017) Splitting methods in communication, imaging, science, and engineering. Springer, Berlin

  • Gotoh J, Takeda A, Tono K (2018) DC formulations and algorithms for sparse optimization problems. Math Program B 169(1):141–176

  • Hagen L, Kahng A (1991) Fast spectral methods for ratio cut partitioning and clustering. In: IEEE international conference on computer-aided design, pp 10–13

  • Hagen L, Kahng A (1992) New spectral methods for ratio cut partitioning and clustering. IEEE Trans Comput Aided Des Integr Circuits Syst 11(9):1074–1085

  • Halkidi M, Gunopulos D, Vazirgiannis M, Kumar N, Domeniconi C (2007) A clustering framework based on subjective and objective validity criteria. ACM Trans Knowl Discov Data 1(4):119–143

  • Hartigan J, Wong M (1979) A K-means clustering algorithm. J R Stat Soc Ser C 28(1):100–108

  • Hein M, Buehler T (2010) An inverse power method for nonlinear eigenproblems with applications in 1-spectral clustering and sparse PCA. In: Advances in neural information processing systems, pp 847–855

  • Hein M, Setzer S (2011) Beyond spectral clustering-tight relaxations of balanced graph cuts. Adv Neural Inf Process Syst 24:2366–2374

  • Huang D, Wang C, Wu J, Lai J, Kwoh C (2020) Ultra-scalable spectral clustering and ensemble clustering. IEEE Trans Knowl Data Eng 32(6):1212–1226

  • Khamaru K, Wainwright M (2019) Convergence guarantees for a class of non-convex and non-smooth optimization problems. J Mach Learn Res 20(154):1–52

  • Kolev K, Cremers D (2009) Continuous ratio optimization via convex relaxation with applications to multiview 3D reconstruction. In: IEEE conference on computer vision and pattern recognition (CVPR), pp 1858–1864

  • Lellmann J, Strekalovskiy E, Koetter S, Cremers D (2013) Total variation regularization for functions with values in a manifold. In: IEEE international conference on computer vision, pp 2944–2951

  • Leng C, Zhang H, Cai G, Cheng I, Basu A (2019) Graph regularized \(L_p\) smooth non-negative matrix factorization for data representation. IEEE/CAA J Autom Sin 6(2):584–595

  • von Luxburg U (2007) A tutorial on spectral clustering. Stat Comput 17(4):395–416

  • Zelnik-Manor L, Perona P (2004) Self-tuning spectral clustering. Adv Neural Inf Process Syst 2:1601–1608

  • Merkurjev E, Bertozzi A, Yan X, Lerman K (2017) Modified Cheeger and ratio cut methods using the Ginzburg–Landau functional for classification of high-dimensional data. Inverse Probl 33(7):074003

  • Nasraoui O, Ben N'Cir C (2019) Clustering methods for big data analytics: techniques, toolboxes and applications. Springer, Berlin

  • Rahimi Y, Wang C, Dong H, Lou Y (2019) A scale-invariant approach for sparse signal recovery. SIAM J Sci Comput 41(6):A3649–A3672

  • Shi J, Malik J (2000) Normalized cuts and image segmentation. IEEE Trans Pattern Anal Mach Intell 22(8):888–905

  • Sun J, Qu Q, Wright J (2017) Complete dictionary recovery over the sphere I: overview and the geometric picture. IEEE Trans Inf Theory 63(2):853–884

  • Szlam A, Bresson X (2010) Total variation and Cheeger cuts. In: Proceedings of the 27th international conference on machine learning, pp 1039–1046

  • Thi H, Dinh T (2018) DC programming and DCA: thirty years of developments. Math Program B 169:5–68

  • Wierzchon S, Klopotek M (2018) Modern algorithms of cluster analysis. Springer, Berlin

  • Yuille A, Rangarajan A (2003) The concave-convex procedure. Neural Comput 15(4):915–936

  • Zhang YZ, Jiang Y, Pang ZF (2013) Cheeger cut model for the balanced data classification problem. Adv Mater Res 765–767:730–734


Acknowledgements

The authors would like to thank Dr. Zhi-Feng Pang for the valuable discussions and constructive suggestions on an earlier version of this paper. This work was partially supported by the Natural Science Foundation of Hunan Province (No. 2019JJ40323).


Corresponding author

Correspondence to Yu-Fei Yang.

Additional information

Communicated by Leonardo de Lima.



About this article


Cite this article

Yang, YF., Zhou, H. & Zhou, B. Data clustering based on the modified relaxation Cheeger cut model. Comp. Appl. Math. 41, 61 (2022). https://doi.org/10.1007/s40314-022-01757-x

