Robust convex clustering

Quan, Zhenzhen; Chen, Songcan

doi:10.1007/s00500-019-04471-9

Foundations
Published: 08 November 2019

Volume 24, pages 731–744, (2020)
Soft Computing

Abstract

Objective-based clustering is an important class of cluster analysis techniques; however, these methods are prone to local minima because their objective functions are non-convex, which degrades the final clustering performance. Recently, convex clustering (CC) has attracted attention because it enjoys global optimality and independence from initialization. However, one of its downsides is that it is not robust to data contaminated with outliers, which causes the clustering results to deviate. To improve its robustness, this paper proposes an outlier-aware robust convex clustering algorithm, called RCC. Specifically, RCC extends CC by modeling the contaminated data as the sum of the clean data and sparse outliers and then adding a Lasso-type regularization term to the CC objective to reflect the sparsity of the outliers. In this way, RCC can resist outliers to a great extent while retaining the advantages of CC, including the convexity of the objective. Further, we develop a block coordinate descent approach with a convergence guarantee and find that RCC usually converges in just a few iterations. Finally, the effectiveness and robustness of RCC are empirically corroborated by numerical experiments on both synthetic and real datasets.
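To make the outlier-sparsity idea concrete, the following is a minimal sketch of one block of a block coordinate descent scheme. It assumes the objective has the form given in Appendix A, i.e. a squared Frobenius loss on X - U - O plus a fusion penalty on U and an entrywise l1 penalty on O. Under that assumption, minimizing over O with U held fixed is a standard Lasso-type proximal step whose closed-form solution is entrywise soft-thresholding (Donoho 1995). The function names and the toy data are illustrative and are not taken from the paper; the update of U (an ordinary convex clustering problem on X - O) is not shown.

```python
import numpy as np

def soft_threshold(Z, tau):
    """Entrywise soft-thresholding: the proximal operator of tau * ||.||_1."""
    return np.sign(Z) * np.maximum(np.abs(Z) - tau, 0.0)

def update_outliers(X, U, gamma2):
    """Minimize 0.5*||X - U - O||_F^2 + gamma2*||O||_1 over O with U fixed.

    Assumed form of the O-block; columns of X, U, O are samples (p x n).
    """
    return soft_threshold(X - U, gamma2)

rng = np.random.default_rng(0)
X = rng.normal(size=(2, 6))   # toy p x n data
X[:, 0] += 10.0               # contaminate the first sample
U = np.zeros_like(X)          # hypothetical current centroid estimate
O = update_outliers(X, U, gamma2=3.0)
```

Residuals whose magnitude is below gamma2 are set exactly to zero, so only samples that deviate strongly from their centroid estimate receive a nonzero outlier vector.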

References

Ascari G, Fagiolo G, Roventini A (2012) Fat-tail distributions and business-cycle models. Macroecon Dyn 19(2):465–476
Bache K, Lichman M (2013) UCI machine learning repository. University of California, School of Information and Computer Science, Irvine, CA. http://archive.ics.uci.edu/ml
Berkhin P (2006) A survey of clustering data mining techniques. Group Multidimens Data 43(1):25–71
Boyd S, Vandenberghe L (2004) Convex optimization. Cambridge University Press, Cambridge
Candès EJ, Li X, Ma Y, Wright J (2011) Robust principal component analysis? J ACM 58(3):11
Chen GK, Chi EC, Ranola JMO, Lange K (2015) Convex clustering: an attractive alternative to hierarchical clustering. PLoS Comput Biol 11(5):e1004228
Chi EC, Lange K (2015) Splitting methods for convex clustering. J Comput Gr Stat 24(4):994–1013
Chi EC, Allen GI, Baraniuk RG (2016) Convex biclustering. Biometrics 73(1):10–19
Dave RN, Krishnapuram R (2002) Robust clustering methods: a unified view. IEEE Trans Fuzzy Syst 5(2):270–293
Davis J, Goadrich M (2006) The relationship between precision-recall and ROC curves. In: Proceedings of the 23rd international conference on machine learning, ACM, New York, pp 233–240
Dietterich TG (2017) Steps toward robust artificial intelligence. AI Mag 38(3):3–24
Donoho DL (1995) De-noising by soft-thresholding. IEEE Trans Inf Theory 41(3):613–627
Du L, Shen YD (2013) Towards robust co-clustering. In: International joint conferences on artificial intelligence (IJCAI), pp 1317–1322
Fan J, Li R (2001) Variable selection via non-concave penalized likelihood and its oracle properties. Publ Am Stat Assoc 96(456):1348–1360
Forero PA, Kekatos V, Giannakis GB (2011) Outlier-aware robust clustering. In: 2011 IEEE international conference on acoustics, speech and signal processing (ICASSP), IEEE, pp 2244–2247
Forero PA, Kekatos V, Giannakis GB (2012) Robust clustering using outlier-sparsity regularization. IEEE Trans Signal Process 60(8):4163–4177
García-Escudero LA, Gordaliza A, Matrán C, Mayo-Iscar A (2010) A review of robust clustering methods. Adv Data Anal Classif 4(2–3):89–109
Giannakis GB, Mateos G, Farahmand S, Kekatos V, Zhu H (2011) USPACOR: universal sparsity-controlling outlier rejection. In: IEEE international conference on acoustics, speech and signal processing (ICASSP), IEEE, pp 1952–1955
Hall LO (2012) Objective function-based clustering. Wiley Interdiscip Rev Data Min Knowl Discov 2(4):326–339
Hallac D, Leskovec J, Boyd S (2015) Network lasso: clustering and optimization in large graphs. In: Proceedings of the 21th ACM SIGKDD international conference on knowledge discovery and data mining, ACM, New York, pp 387–396
Hampel FR, Ronchetti EM, Rousseeuw PJ, Stahel WA (1986) Robust statistics: the approach based on influence functions. Wiley, New York
Hocking TD, Joulin A, Bach F, Vert JP (2011) Clusterpath: an algorithm for clustering using convex fusion penalties. In: 28th international conference on machine learning, p 1
Huber PJ (1981) Robust statistics. Wiley, New York
Hubert L, Arabie P (1985) Comparing partitions. J Classif 2(1):193–218
Jain AK, Dubes RC (1988) Algorithms for clustering data. Prentice Hall, Upper Saddle River
Krijthe JH (2016) RSSL: semi-supervised learning in R. In: International workshop on reproducible research in pattern recognition, Springer, Cham, pp 104–115
Lindsten F, Ohlsson H, Ljung L (2011) Just relax and come clustering!: a convexification of k-means clustering. Linköping University Electronic Press, Linköping
Lu C, Yan S, Lin Z (2016) Convex sparse spectral clustering: single-view to multi-view. IEEE Trans Image Process 25(6):2833–2843
Madeira SC, Oliveira AL (2004) Biclustering algorithms for biological data analysis: a survey. IEEE ACM Trans Comput Biol Bioinf 1(1):24–45
Mateos G, Giannakis GB (2012) Robust PCA as bilinear decomposition with outlier-sparsity regularization. IEEE Trans Signal Process 60(10):5176–5190
Meng D, Zhao Q, Xu Z (2012) Improve robustness of sparse PCA by L1-norm maximization. Pattern Recognit 45(1):487–497
Nagorski J, Allen GI (2016) Genomic region detection via spatial convex clustering. arXiv preprint arXiv:1611.04696
Nie F, Wang H, Cai X et al (2012) Robust matrix completion via joint schatten p-norm and lp-norm minimization. In: 2012 IEEE 12th international conference on data mining (ICDM), IEEE, pp 566–574
Oliveira JVD, Pedrycz W et al (2007) Advances in fuzzy clustering and its applications. Wiley, New York
Parikh N, Boyd S (2014) Proximal algorithms. Found Trends Optim 1(3):127–239
Poddar S, Jacob M (2018) Clustering of data with missing entries. arXiv preprint arXiv:1801.01455
Tachikawa T, Yatabe K, Ikeda Y, et al (2016) Sound source localization based on sparse estimation and convex clustering. In: Proceedings of meetings on acoustics 172ASA, ASA, vol 29, no 1, p 055004
Tibshirani R (1996) Regression shrinkage and selection via the lasso. J R Stat Soc Ser B (Methodol) 58:267–288
Tibshirani R (2011) Regression shrinkage and selection via the lasso: a retrospective. J R Stat Soc Ser B (Stat Methodol) 73(3):273–282
Tošić I, Frossard P (2011) Dictionary learning. IEEE Signal Process Mag 28(2):27–38
Tseng P (2001) Convergence of a block coordinate descent method for nondifferentiable minimization. J Optim Theory Appl 109(3):475–494
Wang S, Liu D, Zhang Z (2013) Nonconvex relaxation approaches to robust matrix recovery. In: International joint conferences on artificial intelligence (IJCAI), pp 1764–1770
Wang B, Zhang Y, Sun W et al (2016) Sparse convex clustering. J Comput Gr Stat. https://doi.org/10.1080/10618600.2017.1377081
Wang Q, Gong P, Chang S et al (2017) Robust convex clustering analysis. In: IEEE international conference on data mining
Weylandt M, Nagorski J, Allen GI (2019) Dynamic visualization and fast computation for convex clustering via algorithmic regularization. J Comput Gr Stat. https://doi.org/10.1080/10618600.2019.1629943
Yuan Y, Sun D, Toh KC (2018) An efficient semismooth Newton based algorithm for convex clustering. arXiv preprint arXiv:1802.07091
Zhang CH (2010) Nearly unbiased variable selection under minimax concave penalty. Ann Stat 38(2):894–942
Zhang H, Zha ZJ, Yan S, Wang M, Chua TS (2012) Robust non-negative graph embedding: towards noisy data, unreliable graphs, and noisy labels. In: IEEE conference on computer vision and pattern recognition (CVPR), IEEE, pp 2464–2471
Zhao Y, Zhu E, Liu X et al (2019) Simultaneous clustering and optimization for evolving datasets. IEEE Trans Knowl Data Eng. https://doi.org/10.1109/TKDE.2019.2923239
Zhu C, Xu H, Leng C et al (2014) Convex optimization procedure for clustering: theoretical revisit. In: Advances in neural information processing systems (NIPS), pp 1619–1627

Acknowledgements

This work is supported by the National Natural Science Foundation of China (NSFC) under the Grant Nos. 61732006 and 61672281, as well as the Key Program of NSFC under Grant No. 61472186.

Author information

Authors and Affiliations

School of Computer Science and Technology, Nanjing University of Aeronautics and Astronautics, Nanjing, 211106, China
Zhenzhen Quan & Songcan Chen

Corresponding author

Correspondence to Songcan Chen.

Ethics declarations

Conflict of interest

The authors declare that they have no conflict of interest.

Additional information

Communicated by A. Di Nola.

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Appendix A

Here we prove that the objective function of RCC in Eq. (4) is strictly convex.

First, we present the definition of strict convexity as follows.

Definition 1

A function $ f : {\mathbb{R}}^{n} \to {\mathbb{R}} $ is strictly convex, if $ {\text{dom }}\, f $ is a convex set and if for all $ \varvec{x},\varvec{ y} \in {\text{dom}} f $, $ f(\theta \varvec{x} + (1 - \theta )\varvec{y}) < \theta f(\varvec{x}) + \left( {1 - \theta } \right)f(\varvec{y}) $ whenever $ \varvec{x} \ne \varvec{y} $ and $ 0 < \theta < 1 $ (Boyd and Vandenberghe 2004).
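As a standard scalar illustration of Definition 1 (added here as an example; it is not part of the original proof), take $ f(x) = x^{2} $. For $ x \ne y $ and $ 0 < \theta < 1 $,

$$ \theta x^{2} + \left( {1 - \theta } \right)y^{2} - \left( {\theta x + \left( {1 - \theta } \right)y} \right)^{2} = \theta \left( {1 - \theta } \right)\left( {x - y} \right)^{2} > 0 $$

so $ f $ is strictly convex. By contrast, $ g(x) = \left| x \right| $ is convex but not strictly convex: for any $ x, y > 0 $ with $ x \ne y $ we have $ g(\theta x + (1 - \theta )y) = \theta g(x) + (1 - \theta )g(y) $, so the inequality in Definition 1 holds only with equality.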

Proof

Recall from Eq. (4) that $ \varvec{U} \in {\mathbb{R}}^{p \times n} $ and $ \varvec{O} \in {\mathbb{R}}^{p \times n} $.

Let $ \tilde{\varvec{U}} = \left[ {\varvec{U},\varvec{O}} \right] \in {\mathbb{R}}^{p \times 2n} $, $ {\tilde{\mathbf{v}}}_{i} = \left[ {\begin{array}{*{20}c} {{\mathbf{y}}_{n(i)} } \\ {{\mathbf{0}}_{n} } \\ \end{array} } \right] \in {\mathbb{R}}^{2n \times 1} $, $ {\hat{\mathbf{v}}}_{i} = \left[ {\begin{array}{*{20}c} {{\mathbf{0}}_{n} } \\ {{\mathbf{y}}_{n(i)} } \\ \end{array} } \right] \in {\mathbb{R}}^{2n \times 1} $, $ {\tilde{\mathbf{I}}} = \left[ {\begin{array}{*{20}c} {{\mathbf{I}}_{n} } \\ {{\mathbf{I}}_{n} } \\ \end{array} } \right] $.

Subsequently, the problem in Eq. (4) can be rewritten as:

$$ F_{\gamma } \left( {\tilde{\varvec{U}}} \right) = \frac{1}{2}\left\| {{\mathbf{X}} - \tilde{\varvec{U}}{\tilde{\mathbf{I}}}} \right\|_{F}^{2} + \gamma_{1} \sum\nolimits_{i < j} {w_{ij} \left\| {\tilde{\varvec{U}}\left( {\tilde{\varvec{v}}_{i} - \tilde{\varvec{v}}_{j} } \right)} \right\|_{2} } + \gamma_{2} \sum\nolimits_{i = 1}^{n} {\left\| {\tilde{\varvec{U}}\hat{\varvec{v}}_{i} } \right\|_{1} } $$

(11)
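The equivalence between the original (U, O) form of the objective and the stacked form in Eq. (11) can be checked numerically. The sketch below is illustrative only: it assumes that $ {\mathbf{y}}_{n(i)} $ denotes the i-th standard basis vector of $ {\mathbb{R}}^{n} $ (so that the stacked vectors select the i-th column of U or of O), that the weights $ w_{ij} $ are nonnegative, and that Eq. (4) consists of the squared loss, the fusion penalty and the l1 penalty that appear in Eq. (11). All function and variable names are hypothetical.

```python
import numpy as np

def rcc_objective(X, U, O, W, gamma1, gamma2):
    """Objective in (U, O) form; columns of X, U, O are samples (p x n)."""
    n = X.shape[1]
    fit = 0.5 * np.linalg.norm(X - U - O, "fro") ** 2
    fusion = sum(W[i, j] * np.linalg.norm(U[:, i] - U[:, j])
                 for i in range(n) for j in range(i + 1, n))
    sparsity = np.abs(O).sum()  # equals the sum of column-wise l1 norms
    return fit + gamma1 * fusion + gamma2 * sparsity

def rcc_objective_stacked(X, U_tilde, W, gamma1, gamma2):
    """Objective in the stacked form of Eq. (11), with U_tilde = [U, O]."""
    n = X.shape[1]
    E, Z = np.eye(n), np.zeros((n, n))
    V_tilde = np.vstack([E, Z])  # column i is [e_i; 0_n] (assumed meaning)
    V_hat = np.vstack([Z, E])    # column i is [0_n; e_i] (assumed meaning)
    I_tilde = np.vstack([E, E])  # so that U_tilde @ I_tilde = U + O
    fit = 0.5 * np.linalg.norm(X - U_tilde @ I_tilde, "fro") ** 2
    fusion = sum(W[i, j] * np.linalg.norm(U_tilde @ (V_tilde[:, i] - V_tilde[:, j]))
                 for i in range(n) for j in range(i + 1, n))
    sparsity = sum(np.abs(U_tilde @ V_hat[:, i]).sum() for i in range(n))
    return fit + gamma1 * fusion + gamma2 * sparsity

rng = np.random.default_rng(0)
p, n = 3, 5
X, U, O = (rng.normal(size=(p, n)) for _ in range(3))
W = np.ones((n, n))  # uniform weights, chosen only for the demonstration
f_plain = rcc_objective(X, U, O, W, gamma1=0.5, gamma2=0.2)
f_stacked = rcc_objective_stacked(X, np.hstack([U, O]), W, gamma1=0.5, gamma2=0.2)
```

Under the stated assumptions the two values agree up to floating-point rounding, which is what allows the proof to work with the single variable $ \tilde{\varvec{U}} $.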

Therefore, proving the strict convexity of problem (4) reduces to proving that $ F_{\gamma } $ in Eq. (11) is strictly convex in $ \tilde{\varvec{U}} $. Applying Definition 1 to Eq. (11) for any $ \theta \in \left( {0,1} \right) $ and any $ \tilde{\varvec{U}}_{1} \ne \tilde{\varvec{U}}_{2} $, we have

$$ F_{\gamma } \left( {\theta \tilde{\varvec{U}}_{1} \varvec{ + }\left( {1 - \theta } \right)\tilde{\varvec{U}}_{2} } \right) - \theta F_{\gamma } \left( {\tilde{\varvec{U}}_{1} } \right) - \left( {1 - \theta } \right)F_{\gamma } \left( {\tilde{\varvec{U}}_{2} } \right) = T_{1} + T_{2} + T_{3} $$

(12)

where

$$ \begin{aligned} T_{1} & = \frac{1}{2}tr\left\{ {{\tilde{\mathbf{I}}}^{T} \left[ {\theta \tilde{\varvec{U}}_{1}^{T} \left( {\theta \tilde{\varvec{U}}_{1} + \left( {1 - \theta } \right)\tilde{\varvec{U}}_{2} } \right) + \left( {1 - \theta } \right)\tilde{\varvec{U}}_{2}^{T} \left( {\theta \tilde{\varvec{U}}_{1} + \left( {1 - \theta } \right)\tilde{\varvec{U}}_{2} } \right) - \theta \tilde{\varvec{U}}_{1}^{T} \tilde{\varvec{U}}_{1} - \left( {1 - \theta } \right)\tilde{\varvec{U}}_{2}^{T} \tilde{\varvec{U}}_{2} } \right]{\tilde{\mathbf{I}}}} \right\} \\ T_{2} & = \gamma_{1} \sum\limits_{i < j} {w_{ij} \left\| {\left( {\theta \tilde{\varvec{U}}_{1} + \left( {1 - \theta } \right)\tilde{\varvec{U}}_{2} } \right)\left( {\tilde{\varvec{v}}_{i} - \tilde{\varvec{v}}_{j} } \right)} \right\|_{2} } - \gamma_{1} \theta \sum\limits_{i < j} {w_{ij} \left\| {\tilde{\varvec{U}}_{1} \left( {\tilde{\varvec{v}}_{i} - \tilde{\varvec{v}}_{j} } \right)} \right\|_{2} } - \gamma_{1} \left( {1 - \theta } \right)\sum\limits_{i < j} {w_{ij} \left\| {\tilde{\varvec{U}}_{2} \left( {\tilde{\varvec{v}}_{i} - \tilde{\varvec{v}}_{j} } \right)} \right\|_{2} } \\ T_{3} & = \gamma_{2} \sum\limits_{i = 1}^{n} {\left\| {\left( {\theta \tilde{\varvec{U}}_{1} + \left( {1 - \theta } \right)\tilde{\varvec{U}}_{2} } \right)\hat{\varvec{v}}_{i} } \right\|_{1} } - \gamma_{2} \theta \sum\limits_{i = 1}^{n} {\left\| {\tilde{\varvec{U}}_{1} \hat{\varvec{v}}_{i} } \right\|_{1} } - \gamma_{2} \left( {1 - \theta } \right)\sum\limits_{i = 1}^{n} {\left\| {\tilde{\varvec{U}}_{2} \hat{\varvec{v}}_{i} } \right\|_{1} } \\ \end{aligned} $$

The terms of the quadratic loss in Eq. (11) that are constant or linear in $ \tilde{\varvec{U}} $ cancel in Eq. (12), so only its purely quadratic part remains in $ T_{1} $.

By the absolute homogeneity and subadditivity (triangle inequality) of norms, T₃ in Eq. (12) is non-positive:

$$ \begin{aligned} T_{3} & \le \gamma_{2} \sum\limits_{i = 1}^{n} {\left\| {\theta \tilde{\varvec{U}}_{1} \hat{\varvec{v}}_{i} } \right\|_{1} } + \gamma_{2} \sum\limits_{i = 1}^{n} {\left\| {\left( {1 - \theta } \right)\tilde{\varvec{U}}_{2} \hat{\varvec{v}}_{i} } \right\|_{1} } - \gamma_{2} \theta \sum\limits_{i = 1}^{n} {\left\| {\tilde{\varvec{U}}_{1} \hat{\varvec{v}}_{i} } \right\|_{1} } - \gamma_{2} \left( {1 - \theta } \right)\sum\limits_{i = 1}^{n} {\left\| {\tilde{\varvec{U}}_{2} \hat{\varvec{v}}_{i} } \right\|_{1} } \\ & = \gamma_{2} \theta \sum\limits_{i = 1}^{n} {\left\| {\tilde{\varvec{U}}_{1} \hat{\varvec{v}}_{i} } \right\|_{1} } + \gamma_{2} \left( {1 - \theta } \right)\sum\limits_{i = 1}^{n} {\left\| {\tilde{\varvec{U}}_{2} \hat{\varvec{v}}_{i} } \right\|_{1} } - \gamma_{2} \theta \sum\limits_{i = 1}^{n} {\left\| {\tilde{\varvec{U}}_{1} \hat{\varvec{v}}_{i} } \right\|_{1} } - \gamma_{2} \left( {1 - \theta } \right)\sum\limits_{i = 1}^{n} {\left\| {\tilde{\varvec{U}}_{2} \hat{\varvec{v}}_{i} } \right\|_{1} } = 0 \\ \end{aligned} $$
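The non-positivity of T₃, and by the same triangle-inequality argument of T₂ in Eq. (12), can be spot-checked numerically. The sketch below is an illustration on random matrices, not a proof. It works directly with the blocks U and O rather than the stacked variable, and it assumes nonnegative weights $ w_{ij} $ (here all equal to one); the sizes and parameter values are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(1)
p, n, gamma1, gamma2 = 3, 5, 0.7, 0.4
W = np.ones((n, n))  # assumed nonnegative fusion weights

def fusion(U):
    """Weighted sum of pairwise l2 distances between columns of U."""
    return sum(W[i, j] * np.linalg.norm(U[:, i] - U[:, j])
               for i in range(n) for j in range(i + 1, n))

def l1(O):
    """Sum of column-wise l1 norms of O."""
    return np.abs(O).sum()

worst_T2 = worst_T3 = -np.inf
for _ in range(200):
    U1, U2, O1, O2 = (rng.normal(size=(p, n)) for _ in range(4))
    theta = rng.uniform(0.01, 0.99)
    T2 = gamma1 * (fusion(theta * U1 + (1 - theta) * U2)
                   - theta * fusion(U1) - (1 - theta) * fusion(U2))
    T3 = gamma2 * (l1(theta * O1 + (1 - theta) * O2)
                   - theta * l1(O1) - (1 - theta) * l1(O2))
    worst_T2 = max(worst_T2, T2)  # largest value seen; should not exceed 0
    worst_T3 = max(worst_T3, T3)
```

Both tracked maxima stay at or below zero up to rounding error, consistent with the convexity of the two penalty terms.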

Similarly, T₂ in Eq. (12) is also non-positive. Thus, Eq. (12) can be simplified to

$$ \begin{aligned} & F_{\gamma } \left( {\theta \tilde{\varvec{U}}_{1} + \left( {1 - \theta } \right)\tilde{\varvec{U}}_{2} } \right) - \theta F_{\gamma } \left( {\tilde{\varvec{U}}_{1} } \right) - \left( {1 - \theta } \right)F_{\gamma } \left( {\tilde{\varvec{U}}_{2} } \right) \\ & \quad \le \frac{1}{2}tr\left\{ { - \theta \left( {1 - \theta } \right){\tilde{\mathbf{I}}}^{T} \left( {\tilde{\varvec{U}}_{1} - \tilde{\varvec{U}}_{2} } \right)^{T} \left( {\tilde{\varvec{U}}_{1} - \tilde{\varvec{U}}_{2} } \right){\tilde{\mathbf{I}}}} \right\} \\ & \quad = - \frac{{\theta \left( {1 - \theta } \right)}}{2}tr\left\{ {{\tilde{\mathbf{I}}}^{T} \left( {\tilde{\varvec{U}}_{1} - \tilde{\varvec{U}}_{2} } \right)^{T} \left( {\tilde{\varvec{U}}_{1} - \tilde{\varvec{U}}_{2} } \right){\tilde{\mathbf{I}}}} \right\} \\ & \quad = - \frac{{\theta \left( {1 - \theta } \right)}}{2}\left\| {\left( {\tilde{\varvec{U}}_{1} - \tilde{\varvec{U}}_{2} } \right){\tilde{\mathbf{I}}}} \right\|_{F}^{2} < 0 \\ \end{aligned} $$

(13)
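For readers who wish to verify the bound on the quadratic term of Eq. (11), the computation can be spelled out as follows. The auxiliary matrices $ \varvec{A} = {\mathbf{X}} - \tilde{\varvec{U}}_{1} {\tilde{\mathbf{I}}} $ and $ \varvec{B} = {\mathbf{X}} - \tilde{\varvec{U}}_{2} {\tilde{\mathbf{I}}} $ are introduced here only for this worked step and do not appear in the original proof. Since $ {\mathbf{X}} - \left( {\theta \tilde{\varvec{U}}_{1} + \left( {1 - \theta } \right)\tilde{\varvec{U}}_{2} } \right){\tilde{\mathbf{I}}} = \theta \varvec{A} + \left( {1 - \theta } \right)\varvec{B} $, expanding the squared Frobenius norms gives

$$ \frac{1}{2}\left\| {\theta \varvec{A} + \left( {1 - \theta } \right)\varvec{B}} \right\|_{F}^{2} - \frac{\theta }{2}\left\| \varvec{A} \right\|_{F}^{2} - \frac{1 - \theta }{2}\left\| \varvec{B} \right\|_{F}^{2} = - \frac{{\theta \left( {1 - \theta } \right)}}{2}\left\| {\varvec{A} - \varvec{B}} \right\|_{F}^{2} = - \frac{{\theta \left( {1 - \theta } \right)}}{2}\left\| {\left( {\tilde{\varvec{U}}_{1} - \tilde{\varvec{U}}_{2} } \right){\tilde{\mathbf{I}}}} \right\|_{F}^{2} $$

Because $ \left( {\tilde{\varvec{U}}_{1} - \tilde{\varvec{U}}_{2} } \right){\tilde{\mathbf{I}}} = \left( {\varvec{U}_{1} + \varvec{O}_{1} } \right) - \left( {\varvec{U}_{2} + \varvec{O}_{2} } \right) $, this quantity is strictly negative exactly when $ \varvec{U}_{1} + \varvec{O}_{1} \ne \varvec{U}_{2} + \varvec{O}_{2} $, and the strict inequality in the last step relies on that condition.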

Finally, we obtain

$$ F_{\gamma } \left( {\theta \tilde{\varvec{U}}_{1} \varvec{ + }\left( {1 - \theta } \right)\tilde{\varvec{U}}_{2} } \right) < \theta F_{\gamma } \left( {\tilde{\varvec{U}}_{1} } \right)\varvec{ + }\left( {1 - \theta } \right)F_{\gamma } \left( {\tilde{\varvec{U}}_{2} } \right) $$

That is, $ F_{\gamma } (\tilde{\varvec{U}}) $ is convex in $ \tilde{\varvec{U}} $. Equivalently, $ F_{\gamma } (\varvec{U},\varvec{O}) $ is jointly convex in $ \left\{ {\varvec{U},\varvec{O}} \right\} $.

About this article

Cite this article

Quan, Z., Chen, S. Robust convex clustering. Soft Comput 24, 731–744 (2020). https://doi.org/10.1007/s00500-019-04471-9

Published: 08 November 2019
Issue Date: January 2020
DOI: https://doi.org/10.1007/s00500-019-04471-9
