
Robust graph representation clustering based on adaptive data correction

Published in: Applied Intelligence

Abstract

Impressive performance has been achieved by learning graphs from data in clustering tasks. However, real data often contain considerable noise, which leads to unreliable or inaccurate constructed graphs. In this paper, we propose adaptive data correction-based graph clustering (ADCGC), which adaptively removes errors and noise from raw data and improves clustering performance. The ADCGC method offers three main advantages. First, we design the weighted truncated Schatten p-norm (WTSpN), instead of the nuclear norm, to recover the low-rank clean data. Second, we choose clean data samples that capture the essential properties of the data as the vertices of the undirected graph, rather than using all the data feature points. Third, we adopt the block-diagonal regularizer to define the edge weights of the graph, which helps to learn an ideal affinity matrix and further improves clustering. In addition, an efficient iterative scheme based on the generalized soft-thresholding operator and alternating minimization is developed to directly solve the nonconvex optimization model. Experimental results show that ADCGC outperforms existing advanced methods both quantitatively and visually.




References

  1. Abhadiomhen SE, Wang Z, Shen X (2022) Coupled low rank representation and subspace clustering. Appl Intell 52(1):530–546

  2. Anvari R, Siahsar MAN, Gholtashi S, Kahoo AR, Mohammadi M (2017) Seismic random noise attenuation using synchrosqueezed wavelet transform and low-rank signal matrix approximation. IEEE Trans Geosci Remote Sens 55(11):6574–6581

  3. Cai X, Huang D, Wang CD, Kwoh CK (2020) Spectral clustering by subspace randomization and graph fusion for high-dimensional data. In: Pacific-asia conference on knowledge discovery and data mining, Springer, pp 330–342

  4. Candès EJ, Li X, Ma Y, Wright J (2011) Robust principal component analysis? J ACM (JACM) 58(3):1–37

  5. Chen B, Sun H, Xia G, Feng L, Li B (2018) Human motion recovery utilizing truncated Schatten p-norm and kinematic constraints. Inf Sci 450:89–108

  6. Chen J, Yang J (2013) Robust subspace segmentation via low-rank representation. IEEE Trans Cybernet 44(8):1432–1445

  7. Chen Y, Zhou Y, Chen W, Zu S, Huang W, Zhang D (2017) Empirical low-rank approximation for seismic noise attenuation. IEEE Trans Geosci Remote Sens 55(8):4696–4711

  8. Doneva M, Amthor T, Koken P, Sommer K, Börnert P (2017) Matrix completion-based reconstruction for undersampled magnetic resonance fingerprinting data. Magn Reson Imaging 41:41–52

  9. Elhamifar E, Vidal R (2013) Sparse subspace clustering: Algorithm, theory, and applications. IEEE Trans Pattern Anal Mach Intell 35(11):2765–2781

  10. Fazel M (2002) Matrix rank minimization with applications. PhD thesis, Stanford University

  11. Belhumeur PN, Hespanha JP, Kriegman DJ (1997) Eigenfaces vs. fisherfaces: recognition using class specific linear projection. IEEE Trans Pattern Anal Mach Intell 19(7):711–720

  12. Gu S, Zhang L, Zuo W, Feng X (2014) Weighted nuclear norm minimization with application to image denoising. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp 2862–2869

  13. Gu S, Xie Q, Meng D, Zuo W, Feng X, Zhang L (2017) Weighted nuclear norm minimization and its applications to low level vision. Int J Comput Vis 121(2):183–208

  14. Guo L, Zhang X, Liu Z, Xue X, Wang Q, Zheng S (2021) Robust subspace clustering based on automatic weighted multiple kernel learning. Inf Sci 573:453–474

  15. Han Y, Zhu L, Cheng Z, Li J, Liu X (2018) Discrete optimal graph clustering. IEEE Trans Cybernet 50(4):1697–1710

  16. Hu Y, Zhang D, Ye J, Li X, He X (2012) Fast and accurate matrix completion via truncated nuclear norm regularization. IEEE Trans Pattern Anal Mach Intell 35(9):2117–2130

  17. Huang T, Wang S, Zhu W (2020) An adaptive kernelized rank-order distance for clustering non-spherical data with high noise. Int J Mach Learn Cybernet 11(8):1735–1747

  18. Ji P, Reid I, Garg R, Li H, Salzmann M (2017) Low-rank kernel subspace clustering. arXiv:1707.04974

  19. Kang Z, Pan H, Hoi SC, Xu Z (2019a) Robust graph learning from noisy data. IEEE Trans Cybernet 50(5):1833–1843

  20. Kang Z, Wen L, Chen W, Xu Z (2019b) Low-rank kernel learning for graph-based clustering. Knowl-Based Syst 163:510–517

  21. Lang K (1995) Newsweeder: learning to filter netnews. In: Machine Learning Proceedings 1995, Elsevier, pp 331–339

  22. Li J, Liu H, Tao Z, Zhao H, Fu Y (2020) Learnable subspace clustering. IEEE Trans Neural Netw Learn Syst 33:1119–1133

  23. Li S, Li W, Hu J, Li Y (2022) Semi-supervised bi-orthogonal constraints dual-graph regularized NMF for subspace clustering. Appl Intell 52(3):3227–3248

  24. Li T, Cheng B, Ni B, Liu G, Yan S (2016) Multitask low-rank affinity graph for image segmentation and image annotation. ACM Trans Intell Syst Technol (TIST) 7(4):1–18

  25. Liu G, Lin Z, Yan S, Sun J, Yu Y, Ma Y (2012) Robust recovery of subspace structures by low-rank representation. IEEE Trans Pattern Anal Mach Intell 35(1):171–184

  26. Liu M, Wang Y, Sun J, Ji Z (2020) Structured block diagonal representation for subspace clustering. Appl Intell 50(8):2523–2536

  27. Lu C, Feng J, Lin Z, Mei T, Yan S (2018) Subspace clustering by block diagonal representation. IEEE Trans Pattern Anal Mach Intell 41(2):487–501

  28. Lu G-F, Wang Y, Tang G (2022) Robust low-rank representation with adaptive graph regularization from clean data. Appl Intell 52(5):5830–5840

  29. Lyons MJ, Akamatsu S, Kamachi M, Gyoba J, Budynek J (1998) The Japanese Female Facial Expression (JAFFE) database. In: Proceedings of third international conference on automatic face and gesture recognition, pp 14–16

  30. Martinez A, Benavente R (1998) The AR face database. CVC Technical Report 24

  31. Nikolova M, Ng MK (2005) Analysis of half-quadratic minimization methods for signal and image recovery. SIAM J Sci Comput 27(3):937–966

  32. Peng H, Long F, Ding C (2005) Feature selection based on mutual information criteria of max-dependency, max-relevance, and min-redundancy. IEEE Trans Pattern Anal Mach Intell 27(8):1226–1238

  33. Nene SA, Nayar SK, Murase H (1996) Columbia object image library (COIL-20). Technical Report CUCS-005-96, Columbia University

  34. Samaria FS, Harter AC (1994) Parameterisation of a stochastic model for human face identification. In: Proceedings of 1994 IEEE workshop on applications of computer vision, IEEE, pp 138–142

  35. Schmitz MA, Heitz M, Bonneel N, Ngole F, Coeurjolly D, Cuturi M, Peyré G, Starck J L (2018) Wasserstein dictionary learning: optimal transport-based unsupervised nonlinear dictionary learning. SIAM J Imaging Sci 11(1):643–678

  36. Sim T, Baker S, Bsat M (2002) The CMU Pose, Illumination, and Expression (PIE) database. In: Proceedings of fifth IEEE international conference on automatic face gesture recognition, IEEE, pp 53–58

  37. Singh D, Singh B (2019) Hybridization of feature selection and feature weighting for high dimensional data. Appl Intell 49(4):1580–1596

  38. Vidal R (2011) Subspace clustering. IEEE Signal Proc Mag 28(2):52–68

  39. Wang L, Huang J, Yin M, Cai R, Hao Z (2020) Block diagonal representation learning for robust subspace clustering. Inf Sci 526:54–67

  40. Xu Y, Chen S, Li J, Luo L, Yang J (2021) Learnable low-rank latent dictionary for subspace clustering. Pattern Recogn 120:108142

  41. Xue X, Zhang X, Feng X, Sun H, Chen W, Liu Z (2020) Robust subspace clustering based on non-convex low-rank approximation and adaptive kernel. Inf Sci 513:190–205

  42. Xue Z, Dong J, Zhao Y, Liu C, Chellali R (2019) Low-rank and sparse matrix decomposition via the truncated nuclear norm and a sparse regularizer. Vis Comput 35(11):1549–1566

  43. Li X, Cui G, Dong Y (2017) Graph regularized non-negative low-rank matrix factorization for image clustering. IEEE Trans Cybernet 47(11):3840–3853

  44. Yin M, Xie S, Wu Z, Zhang Y, Gao J (2018) Subspace clustering via learning an adaptive low-rank graph. IEEE Trans Image Process 27(8):3716–3728

  45. Yuan C, Zhong Z, Lei C, Zhu X, Hu R (2021) Adaptive reverse graph learning for robust subspace learning. Inf Process Manag 58(6):102733

  46. Zhang GY, Chen XW, Zhou YR, Wang CD, Huang D, He XY (2022) Kernelized multi-view subspace clustering via auto-weighted graph learning. Appl Intell 52(1):716–731

  47. Zhang T, Tang Z, Liu Q (2017a) Robust subspace clustering via joint weighted Schatten-p norm and lq norm minimization. J Electron Imaging 26(3):033021

  48. Zhang X, Chen B, Sun H, Liu Z, Ren Z, Li Y (2019a) Robust low-rank kernel subspace clustering based on the Schatten p-norm and correntropy. IEEE Trans Knowl Data Eng 32(12):2426–2437

  49. Zhang Z, Jiang W, Qin J, Zhang L, Li F, Zhang M, Yan S (2017b) Jointly learning structured analysis discriminative dictionary and analysis multiclass classifier. IEEE Trans Neural Netw Learn Syst 29(8):3798–3814

  50. Zhang Z, Zhang Y, Liu G, Tang J, Yan S, Wang M (2019b) Joint label prediction based semi-supervised adaptive concept factorization for robust data representation. IEEE Trans Knowl Data Eng 32(5):952–970

  51. Zheng R, Li M, Liang Z, Wu FX, Pan Y, Wang J (2019) SinNLRR: a robust subspace clustering method for cell type detection by non-negative and low-rank representation. Bioinformatics 35(19):3642–3650

  52. Zheng R, Liang Z, Chen X, Tian Y, Cao C, Li M (2020) An adaptive sparse subspace clustering for cell type identification. Front Genet 11:407

  53. Zheng Y, Zhang X, Yang S, Jiao L (2013) Low-rank representation with local constraint for graph construction. Neurocomputing 122:398–405

  54. Zhu P, Zhu W, Hu Q, Zhang C, Zuo W (2017) Subspace clustering guided unsupervised feature selection. Pattern Recogn 66:364–374

Acknowledgements

This work was supported by grants 62102331, 2020YJ0432 and 2018TZDZX002. We thank Canyi Lu, Pan Ji, Zhao Kang and Xuqian Xue for providing the code for BDR, LRKSC, RGC and LAKRSC, respectively. Finally, we thank the anonymous reviewers for their comments, which helped improve this work.

Author information

Corresponding author

Correspondence to Zhigui Liu.


Appendices

Appendix A

We let \(\mathbf{H} = R{\varLambda} {L^{T}}\) be the SVD of \(\mathbf{H}\), and

$$ \begin{array}{@{}rcl@{}} \hat{H} = \arg\underset{H}{\min} \left\| {{\varSigma} - \mathbf{H}} \right\|_{F}^{2} + \left\| \mathbf{H} \right\|_{w,sp}^{p} \end{array} $$
(39)

can be rewritten as

$$ \begin{array}{@{}rcl@{}} (\hat{R},\hat{{\varLambda}},\hat{L}) &=& \arg\underset{R,{\varLambda},L}{\min} \left\| {R{\varLambda} {L^{T}} - {\varSigma}} \right\|_{F}^{2} + \left\| {R{\varLambda} {L^{T}}} \right\|_{w,sp}^{p},\\ && \text{s.t.}\ R{R^{T}} = I,\ {L^{T}}L = I \end{array} $$
(40)

Equation (40) can be alternately solved.

(1) Updating \((\hat {{R}},\hat {{L}})\):

$$ \begin{array}{@{}rcl@{}} (\hat{R},\hat{L}) = \arg\underset{R,L}{\min} \left\| {R{\varLambda} {L^{T}} - {\varSigma} } \right\|_{F}^{2} \end{array} $$

According to [12], we have

$$ \begin{array}{@{}rcl@{}} \underset{R,L}{\min} \left\| {R{\varLambda} {L^{T}} - {\varSigma} } \right\|_{F}^{2} = tr[{\varLambda} {\varLambda} + {\varSigma} {\varSigma} ] - 2\sum\limits_{j = 1}^{\min (m,n)} {\delta_{j}}({\varSigma} ){\delta_{j}}({\varLambda} ) \end{array} $$

The optimal solutions for R and L are the column and row bases of the SVD of Λ, respectively.

(2) Updating Λ:

$$ \begin{array}{@{}rcl@{}} \hat{{\varLambda}} = \arg\underset{{\varLambda}}{\min} \left\| {R{\varLambda} {L^{T}} - {\varSigma} } \right\|_{F}^{2} + \left\| {R{\varLambda} {L^{T}}} \right\|_{w,sp}^{p} \end{array} $$
(41)

Since \({\varLambda}\) is a diagonal matrix and R and L are permutation matrices, \(R{\varLambda} {L^{T}}\) is a diagonal matrix whose entries are arranged in nonascending order. Equation (41) can therefore be expressed as

$$ \begin{array}{@{}rcl@{}} \hat{{\varLambda}} = \arg\underset{{\varLambda}}{\min} \sum\limits_{j = 1}^{\min (m,n)} {\left\| {{{(R{\varLambda} {L^{T}})}_{jj}} - {{\varSigma}_{jj}}} \right\|_{2}^{2}} {\text{ + }}{\left| {{w_{j}} \cdot {{(R{{\varLambda}^{p}}{L^{T}})}_{jj}}} \right|_{1}} \end{array} $$

Inspired by [12], we apply the soft-thresholding operation to each component of the matrix \(R{{\varLambda}^{p}}{L^{T}}\), which yields \(\hat{{\varLambda}} = {R^{T}}{S_{w}}({{\varSigma}^{p}})L\).

Equation (39) can be solved by

$$ \begin{array}{@{}rcl@{}} \left\{\begin{array}{l} (R_{t + 1}^{T},{{\varLambda}_{t}},L_{t + 1}^{T}) = SVD({{\varLambda}_{t}})\\ {{\varLambda}_{t + 1}} = R_{t + 1}^{T}{S_{w}}({{\varSigma}^{p}})L_{t + 1}^{T} \end{array}\right. \end{array} $$
(42)

We have \(\mathbf{Q} = {\varGamma} {\hat {R}^{T}}{S_{w}}({{\varSigma }^{p}})\hat {L}{{\varUpsilon }^{T}}\). When the weight vector is in nondescending order (i.e., \(0 \le {w_{1}} \le {w_{2}} \le {\cdots} \le {w_{{\min \limits } (m,n)}}\)), we initialize Λ0 in (42) and obtain

$$ \begin{array}{@{}rcl@{}} \left\{\begin{array}{l} ({R_{1}} = I,{{\varLambda}_{0}},{L_{1}} = I) = SVD({{\varLambda}_{0}})\\ {{\varLambda}_{1}} = I{S_{w}}({{\varSigma}^{p}})I = {S_{w}}({{\varSigma}^{p}}) \end{array}\right. \end{array} $$

Therefore, we can obtain the solution \(\mathbf {Q} = {\varGamma } {S_{w}}({{\varSigma }^{p}}){{\varUpsilon }^{T}}\) of the WTSpNM problem in (19).
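In the special case p = 1, the WTSpN proximal step above reduces to weighted singular-value soft-thresholding (the WNNM-style shrinkage of [12]): each singular value is shrunk by its own weight. The following NumPy sketch illustrates this special case only; the function name and matrices are illustrative, not from the paper.

```python
import numpy as np

def weighted_svt(X, w):
    """Weighted singular-value soft-thresholding: the p = 1 special case
    of the WTSpN proximal step, Q = Gamma * S_w(Sigma) * Upsilon^T.

    w is a vector of nonnegative weights, one per singular value.
    """
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    s_shrunk = np.maximum(s - w, 0.0)  # S_w: shrink each singular value by its weight
    return (U * s_shrunk) @ Vt         # reassemble with the thresholded spectrum
```

Assigning smaller weights to larger singular values (the nondescending ordering assumed above) preserves the dominant, information-carrying components while suppressing the small, noise-dominated ones.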

Appendix B

Let the objective function be f(Zt,Ct,Qt,Et,Tt), where t denotes the iteration index. Let \({J_{1}} = \{ \mathbf{C} \mid {\mathbf{C}_{ii}} = 0, \mathbf{C} \ge 0, \mathbf{C} = {\mathbf{C}^{T}}\}\) and \({J_{2}} = \{ \mathbf{W} \mid Tr(\mathbf{W}) = k, 0 \le \mathbf{W} \le \mathbf{I}\}\), and define the indicator functions of \(J_{1}\) and \(J_{2}\) as \({l_{{j_{1}}}}\) and \({l_{{j_{2}}}}\), respectively.

Inspired by [27], the sequence (Zt,Ct,Qt,Et,Tt) obtained via Algorithm 1 has the following properties:

  1. The objective f(Zt,Ct,Qt,Et,Tt) is monotonically decreasing:

    $$ \begin{array}{@{}rcl@{}} &&{}f({\mathbf{Z}^{t + 1}},{\mathbf{C}^{t + 1}},{\mathbf{Q}^{t + 1}},{\mathbf{E}^{t + 1}},{\mathbf{T}^{t + 1}}) + {l_{{j_{1}}}}({\mathbf{C}^{t + 1}}) + {l_{{j_{2}}}}({\mathbf{W}^{t + 1}}) \\ &&{}\le f({\mathbf{Z}^{t}},{\mathbf{C}^{t}},{\mathbf{Q}^{t}},{\mathbf{E}^{t}},{\mathbf{T}^{t}}) + {l_{{j_{1}}}}({\mathbf{C}^{t}}) + {l_{{j_{2}}}}({\mathbf{W}^{t}}) - \frac{\mu}{2}(\left\| {\mathbf{Z}^{t + 1}}\right.\\ &&\left.- {\mathbf{Z}^{t}} \right\|_{F}^{2} + \left\| {{\mathbf{C}^{t + 1}} - {\mathbf{C}^{t}}} \right\|_{F}^{2}\\ &&+ \left\| {{\mathbf{Q}^{t + 1}} - {\mathbf{Q}^{t}}} \right\|_{F}^{2} + \left\| {{\mathbf{E}^{t + 1}} - {\mathbf{E}^{t}}} \right\|_{F}^{2} + \frac{1}{\mu}\left\| {{\mathbf{T}^{t + 1}} - {\mathbf{T}^{t}}} \right\|_{F}^{2}) \end{array} $$
  2. Successive differences vanish: \({\mathbf{Z}^{t+1}} - {\mathbf{Z}^{t}} \to 0\), \({\mathbf{C}^{t+1}} - {\mathbf{C}^{t}} \to 0\), \({\mathbf{Q}^{t+1}} - {\mathbf{Q}^{t}} \to 0\), \({\mathbf{E}^{t+1}} - {\mathbf{E}^{t}} \to 0\), \({\mathbf{T}^{t+1}} - {\mathbf{T}^{t}} \to 0\).

  3. The sequences {Zt}, {Ct}, {Qt}, {Et} and {Tt} are bounded.

Therefore, according to Theorem 7 in [27], any limit point (Z∗,C∗,Q∗,E∗,T∗) of the sequence (Zt,Ct,Qt,Et,Tt) is a stationary point of (17).
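Property (1) is the standard monotone-descent argument for alternating minimization: each block update minimizes the objective over that block with the others fixed, so the objective can never increase. The following self-contained toy sketch (alternating least-squares matrix factorization, not the ADCGC model itself) illustrates the same mechanism; all names are illustrative.

```python
import numpy as np

def als_factorize(M, k, iters=50, seed=0):
    """Toy alternating minimization for f(U, V) = ||M - U V^T||_F^2.

    Each step solves exactly for one block with the other fixed, so the
    recorded objective values are monotonically nonincreasing -- the same
    descent property stated for Algorithm 1 above.
    """
    rng = np.random.default_rng(seed)
    m, n = M.shape
    U = rng.standard_normal((m, k))
    V = rng.standard_normal((n, k))
    objs = []
    for _ in range(iters):
        U = M @ V @ np.linalg.pinv(V.T @ V)    # exact minimizer over U
        V = M.T @ U @ np.linalg.pinv(U.T @ U)  # exact minimizer over V
        objs.append(np.linalg.norm(M - U @ V.T, "fro") ** 2)
    return U, V, objs
```

In practice, property (2) also supplies the usual stopping rule: terminate once successive iterates differ by less than a tolerance in the Frobenius norm.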


Cite this article

Guo, L., Zhang, X., Zhang, R. et al. Robust graph representation clustering based on adaptive data correction. Appl Intell 53, 17074–17092 (2023). https://doi.org/10.1007/s10489-022-04268-8
