
Joint learning affinity matrix and representation matrix for robust low-rank multi-kernel clustering


Abstract

Multi-kernel subspace clustering has attracted widespread attention because it can handle nonlinear data effectively. It typically solves for the representation coefficients between data points through a subspace clustering optimization model, and then feeds the constructed affinity matrix into a spectral clustering method to obtain the final clustering result. The quality of the affinity matrix (graph) therefore has a significant impact on the final clustering result. Unfortunately, previous multi-kernel subspace clustering methods suffer from two deficiencies: 1) the typical two-phase scheme restricts the learning of the affinity matrix; 2) they do not fully exploit the global structure of the data mapped into the kernel space. To address both problems simultaneously, we propose a novel low-rank multi-kernel subspace clustering method with a joint learning scheme, named JALSC. Its innovations are twofold: 1) adaptive local structure learning is used to learn the data representation and the affinity graph simultaneously within a single integrated objective function; the optimal affinity graph obtained by this one-step learning scheme helps improve clustering performance; 2) a non-convex low-rank approximation function constrains the consensus kernel so as to preserve the global structure of the data after it is mapped into the feature space. Extensive experiments on several commonly used datasets show that JALSC achieves the best clustering performance and better robustness compared with several state-of-the-art multi-kernel clustering methods.



Acknowledgements

This work has been supported in part by the Sichuan Province Science and Technology Support Program under Grant Nos. 2020YJ0432, 2020YFS0360, 18YYJC1688 and 18ZB0611, and by the National Natural Science Foundation of China under Grant Nos. 62102331, 62176125 and 61772272.

Author information

Corresponding author

Correspondence to Xiaoqian Zhang.

Ethics declarations

Conflict of interest

The authors declare that they have no conflict of interest.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.


Appendices

Appendix A

Lemma 1

Let the singular value decomposition (SVD) of \(\boldsymbol{P}\in \mathbb{R}^{m\times n}\) be P = UΣV^T with Σ = diag(σ1, σ2, …, σr), where \(r=\min(m,n)\). If σ1 ≥ σ2 ≥ ⋯ ≥ σr and the corresponding weights satisfy ω1 ≤ ω2 ≤ ⋯ ≤ ωr, then for the following problem:

$$ \min_{\boldsymbol{Q} \in \mathbb{R}^{m \times n}}\left( \frac{1}{2}\|\boldsymbol{Q}-\boldsymbol{P}\|_{F}^{2}+\lambda\|\boldsymbol{Q}\|_{w, S_{p}}^{p}\right) $$
(A.1)

its closed-form solution is Q = UΛV^T, where Λ is the diagonal matrix of the singular values of Q. Problem (A.1) then reduces to solving for Λ = diag(δ1, δ2, …, δr) in the following problem:

$$ \min_{\delta_{1}, \delta_{2}, \ldots, \delta_{r}} \sum\limits_{i=1}^{r}\left[\frac{1}{2}\left( \delta_{i}-\sigma_{i}\right)^{2}+\lambda w_{i} \delta_{i}^{p}\right], \quad \text { s.t. } \delta_{i} \geq 0, \ i=1,2, \ldots, r $$
(A.2)

The Generalized Soft-Thresholding (GST) algorithm can be used to solve (A.2) by decoupling it into r independent scalar sub-problems, as described in Lemma 2.

Lemma 2

Given \(y \in \mathbb{R}\) and λ > 0, consider the following optimization problem in x:

$$ \min_{x} \left[ \frac{1}{2}(x-y)^{2}+\lambda\left|x\right|^{p} \right] $$
(A.3)

Let \(\tau _p^{GST}(\lambda )\) denote the threshold of the GST method and x* the corresponding minimizer. At the threshold, i.e., when \(|y| = \tau _p^{GST}(\lambda )\), the objective value of (A.3) at x* equals the objective value at x = 0:

$$ \frac{1}{2}\left( x^{*}-\tau_{p}^{GST}(\lambda)\right)^{2}+\lambda (x^{*})^{p}=\frac{1}{2}(\tau_{p}^{GST}(\lambda))^{2} $$
(A.4)

After some straightforward algebraic manipulation of the above equation, we obtain:

$$ \begin{array}{c} x^{*}=[2\lambda(1-p)]^{\frac{1}{2-p}}\\ \tau_{p}^{GST}(\lambda)=[2\lambda(1-p)]^{\frac{1}{2-p}}+\lambda p [2\lambda(1-p)]^{\frac{p-1}{2-p}} \end{array} $$
(A.5)
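As a quick numerical check of (A.4) and (A.5): taking λ = 1 and p = 1/2 gives x* = [2 · 1 · (1 − 1/2)]^{1/(3/2)} = 1 and τ_p^{GST}(1) = 1 + (1/2) · 1 = 3/2; then ½(1 − 3/2)² + 1^{1/2} = 9/8 = ½(3/2)², so the two sides of (A.4) indeed coincide.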

The solution of (A.3) then follows as:

$$ x^{*}=\begin{cases} 0, & |y| \leq \tau_{p}^{GST}(\lambda) \\ \operatorname{sgn}(y)\, S_{p}^{GST}(|y|, \lambda), & |y|>\tau_{p}^{GST}(\lambda) \end{cases} $$
(A.6)

where \(S_{p}^{GST}(|y|, \lambda )\) is obtained by solving the following equation:

$$ \lambda p (S_{p}^{GST}(|y|, \lambda))^{p-1}+S_{p}^{GST}(|y|, \lambda)-|y|=0 $$
(A.7)
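To make Lemmas 1 and 2 concrete, below is a minimal NumPy sketch (ours, not the authors' released implementation) of the GST fixed-point iteration for (A.7) and of the resulting weighted Schatten p-norm proximal step from Lemma 1; the helper names gst and prox_weighted_schatten_p are hypothetical.

```python
import numpy as np

def gst(y, lam, p, n_iter=10):
    # Solve the scalar problem (A.3): min_x 0.5*(x - y)^2 + lam*|x|^p, 0 < p < 1.
    t = (2.0 * lam * (1.0 - p)) ** (1.0 / (2.0 - p))
    tau = t + lam * p * t ** (p - 1.0)   # threshold tau_p^GST(lam) from (A.5)
    if abs(y) <= tau:
        return 0.0
    x = abs(y)
    for _ in range(n_iter):              # fixed-point iteration on (A.7)
        x = abs(y) - lam * p * x ** (p - 1.0)
    return np.sign(y) * x

def prox_weighted_schatten_p(P, lam, weights, p):
    # Solve (A.1): min_Q 0.5*||Q - P||_F^2 + lam*||Q||_{w,Sp}^p with the
    # non-descending weights of Lemma 1, by decoupling it into the r
    # scalar sub-problems (A.2), one per singular value.
    U, sigma, Vt = np.linalg.svd(P, full_matrices=False)
    delta = np.array([gst(s, lam * w, p) for s, w in zip(sigma, weights)])
    return U @ np.diag(delta) @ Vt
```

Because the weights are non-descending while the singular values are non-ascending, larger singular values are shrunk less than smaller ones, which is exactly the behavior the non-convex low-rank surrogate is designed to deliver.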

Appendix B

Consider the following problem in vector form:

$$ \underset{\boldsymbol{c}}{\arg\min}\|\boldsymbol{c}+\boldsymbol{d}\|_{F}^{2} \quad \text { s.t. } \boldsymbol{c}^{T} \boldsymbol{1}=1,\boldsymbol{c} \succeq 0 $$
(A.8)

where \(\boldsymbol {c}\in \mathbb {R}^{n\times {1}}\) is the target vector and \(\boldsymbol {d}\in \mathbb {R}^{n\times {1}}\) is a known distance vector. We require the learned affinity graph c to keep only the k most important edges, so c should have exactly k non-zero entries. The solution of (A.8) is:

$$ \boldsymbol{c}=\left( \frac{1+{\sum}_{i=1}^{k}\boldsymbol{d}_{i}^{\to}}{k}\boldsymbol{1}-\boldsymbol{d}\right)_{+} $$
(A.9)

where \(\boldsymbol {d}_{i}^{\to }\) denotes the i-th smallest element of d, i.e., the elements of d sorted in ascending order.

Proof

Because problem (A.8) carries the constraints c^T 1 = 1 and c ≽ 0, its Lagrangian function is:

$$ \mathcal{L}(\boldsymbol{c},\kappa,\rho)=\frac{1}{2}\|\boldsymbol{c}+\boldsymbol{d}\|_{F}^{2}-\kappa\left( \boldsymbol{c}^{T} \boldsymbol{1}-1\right)-\rho^{T} \boldsymbol{c} $$
(A.10)

where κ and ρ ≽ 0 are the Lagrange multipliers for the equality and non-negativity constraints, respectively. From the KKT conditions we obtain:

$$ \left\{\begin{array}{l}\boldsymbol{c}+\boldsymbol{d}-\kappa \boldsymbol{1}-\rho=0 \\ \boldsymbol{c}^{T} \mathbf{1}-1=0 \\ \rho^{T} \boldsymbol{c}=0 \end{array}\right. $$
(A.11)

By the third (complementary slackness) equation, ρi = 0 whenever ci > 0, and ci = 0 whenever ρi > 0. Substituting into the first equation gives

$$ \boldsymbol{c}=(\kappa \boldsymbol{1}-\boldsymbol{d})_{+} $$
(A.12)

Since c ≽ 0 has exactly k positive entries, κ must satisfy:

$$ \kappa-{\boldsymbol{d}_{k}^{\to}}>0 ,\kappa-{\boldsymbol{d}_{k+1}^{\to}} \leq 0 $$
(A.13)

Combining (A.12) with the constraint c^T 1 = 1, we have

$$ {\sum}_{i=1}^{k}\left( \kappa-{\boldsymbol{d}_{i}^{\to}}\right)=1 $$
(A.14)
$$ \kappa=\frac{1+{\sum}_{i=1}^{k} \boldsymbol{d}_{i}^{\to}}{k} $$
(A.15)

Finally, substituting the κ of (A.15) into (A.12) yields the closed-form solution (A.9) to problem (A.8). □
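For illustration, here is a minimal NumPy sketch of the closed form just derived (the helper name k_sparse_simplex is ours, not from the paper); it assumes k is chosen so that condition (A.13) actually holds.

```python
import numpy as np

def k_sparse_simplex(d, k):
    # Closed-form solution (A.9) of problem (A.8): min_c ||c + d||^2
    # s.t. c^T 1 = 1, c >= 0, keeping only the k smallest distances.
    d = np.asarray(d, dtype=float)
    d_sorted = np.sort(d)                    # d_i^->: d in ascending order
    kappa = (1.0 + d_sorted[:k].sum()) / k   # multiplier kappa from (A.15)
    # Valid when (A.13) holds: d_sorted[k-1] < kappa <= d_sorted[k].
    return np.maximum(kappa - d, 0.0)        # (kappa*1 - d)_+, i.e. (A.12)

# Sanity check: c sums to 1 and has exactly k positive entries.
c = k_sparse_simplex(np.array([0.1, 2.0, 0.2, 3.0, 0.3]), k=3)
assert np.isclose(c.sum(), 1.0) and np.count_nonzero(c) == 3
```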


About this article


Cite this article

Luo, L., Liang, Q., Zhang, X. et al. Joint learning affinity matrix and representation matrix for robust low-rank multi-kernel clustering. Appl Intell 52, 13987–14004 (2022). https://doi.org/10.1007/s10489-021-02974-3
