Abstract
Spectral clustering has been demonstrated to often outperform K-means clustering in real applications because it improves the similarity measurement of K-means clustering. However, previous spectral clustering method still suffers from the following issues: 1) easily being affected by outliers; 2) constructing the affinity matrix from original data which often contains redundant features and outliers; and 3) unable to automatically specify the cluster number. This paper focuses on address these issues by proposing a new clustering algorithm along with the technique of half-quadratic optimization. Specifically, the proposed method learns the affinity matrix from low-dimensional space of original data, which is obtained by using a robust estimator to remove the influence of outliers as well as a sparsity regularization to remove redundant features. Moreover, the proposed method employs the ℓ2,1-norm regularization to automatically learn the cluster number according to the data distribution. Experimental results on both synthetic and real data sets demonstrated that the proposed method outperforms the state-of-the-art methods in terms of clustering performance.
Similar content being viewed by others
Explore related subjects
Discover the latest articles, news and stories from top researchers in related subjects.References
Alberto, P., Casbon, J.A., Saqi, M.A.S.: Spectral clustering of protein sequences. Nucleic Acids Res. 34(5), 1571–1580 (2006)
Boyd, V., Faybusovich, L.: Convex optimization. IEEE Trans. Autom. Control 51(11), 1859–1859 (2006)
Charbonnier, P., Blancfraud, L., Aubert, G., Barlaud, M.: Deterministic edge-preserving regularization in computed imaging. IEEE Trans. Image Process. 6(2), 298–311 (1997)
Edgar, R.C.: Search and clustering orders of magnitude faster than blast. Bioinformatics 26(19), 2460 (2010)
Elhamifar, E., Vidal, R.: Sparse subspace clustering: Algorithm, theory, and applications. IEEE Trans Pattern Anal Mach Intell 35(11), 2765–2781 (2013)
Goh, A., Vidal, R.: Locally linear manifold clustering. Journal of Machine Learning Research (2009)
Guangcan, L., Zhouchen, L., Shuicheng, Y., Ju, S., Yu, Y., Yi, M.: Robust recovery of subspace structures by low-rank representation. IEEE Trans. Pattern Anal. Mach. Intell. 35(1), 171–184 (2013)
Hartigan, J.A., Wong, M.A.: A k-means clustering algorithm. In: ECCV, pp. 492–503 (2008)
He, R., Hu, B., Yuan, X., Wang, L.: M-Estimators and Half-Quadratic minimization (2014)
Hu, H., Lin, Z., Feng, J., Zhou, J.: Smooth representation clustering. In: CVPR, pp. 3834–3841 (2014)
Hu, M., Yang, Y., Shen, F., Xie, N., Hong, R., Shen, H.T.: Collective reconstructive embeddings for cross-modal hashing. IEEE Transactions on Image Processing. https://doi.org/10.1109/TIP.2018.2890144 (2019)
Huber, P.J.: Robust statistics. Wiley, New York (2011)
Ji, Z., Tao, X., Wang, H.: Outlier detection from large distributed databases. World Wide Web 17(4), 539–568 (2014)
Junye, G., Liu, M., Zhang, D.: Spectral clustering algorithm based on effective distance. Journal of Frontiers of Computer Science & Technology (2014)
Law, M.H.C., Figueiredo, J., Jain, A.K.: Simultaneous feature selection and clustering using mixture models. IEEE Trans Pattern Anal Mach Intell 26(9), 1154–1166 (2004)
Lei, C., Zhu, X.: Unsupervised feature selection via local structure learning and sparse learning. Multimed Tools Appl 77(22), 29605–29622 (2018)
Li, C.G., You, C., Vidal, R.: Structured sparse subspace clustering: A joint affinity learning and subspace clustering framework. IEEE Trans. Image Process. 26 (6), 2988–3001 (2017)
Liu, X.Y., Jing-Wei, L.I., Hong, Y.U., You, Q.Z., Lin, H.F.: Adaptive spectral clustering based on shared nearest neighbors. J. Chin. Comput. Syst. 32(9), 1876–1880 (2011)
Liu, G., Lin, Z., Yan, S., Ju, S., Yu, Y., Ma, Y.: Robust recovery of subspace structures by low-rank representation. IEEE Trans Pattern Anal Mach Intell 35(1), 171 (2013)
Ng, A.Y., Jordan, M.I., Weiss, Y.: On spectral clustering: analysis and an algorithm. In: NIPS, pp .849–856 (2001)
Nie, F., Wang, X., Jordan, M.I., Huang, H.: The constrained laplacian rank algorithm for graph-based clustering. In: AAAI, pp. 1969–1976 (2016)
Nikolova, M., Ng, MK.: Analysis of Half-Quadratic minimization methods for signal and image recovery. Society for Industrial and Applied Mathematics (2005)
Peng, Y., Zhu, Q., Huang, B.: Spectral clustering with density sensitive similarity function. Knowl.-Based Syst. 24(5), 621–628 (2011)
Roth, V., Lange, T.: Feature selection in clustering problems. In: NIPS (2003)
Shah, S.A., Koltun, V.: Robust continuous clustering. Proc Natl Acad Sci U SA 114(37), 9814–9819 (2017)
Szu-Hao, H., Yi-Hong, C., Shang-Hong, L., Novak, C.L.: Learning-based vertebra detection and iterative normalized-cut segmentation for spinal mri. IEEE Trans. Med. Imaging 28(10), 1595–1605 (2009)
Tao, X., Li, Y., Zhong, N.: A personalized ontology model for Web information gathering. IEEE Trans Knowl Data Eng 23(4), 496–511 (2010)
Wang, R., Zong, M.: Joint self-representation and subspace learning for unsupervised feature selection. World Wide Web 21(6), 1745–1758 (2018)
Wang, W.W., Xiao-Ping, L.I., Feng, X.C., Si, Qi W.: A survey on sparse subspace clustering. Acta Autom. Sin. 41(8), 1373–1384 (2015)
Wang, L., Zheng, K., Tao, X., Han, X.: Affinity propagation clustering algorithm based on large-scale data-set. International Journal of Computers & Applications (3), 1–6 (2018)
Wang, R., Ji, W., Liu, M., Wang, X., Weng, J., Deng, S., Gao, S., Yuan, C.-a.: Review on mining data from multiple data sources. Pattern Recogn. Lett. 109, 120–128 (2018)
Wang, R., Ji, W., Song, B.: Durable relationship prediction and description using a large dynamic graph. World Wide Web 21(6), 1575–1600 (2018)
Yan, J., Pollefeys, M.: A general framework for motion segmentation Independent, articulated, rigid, non-rigid, degenerate and non-degenerate. In: ECCV, pp. 94–106 (2006)
Yi, B., Yang, Y., Shen, F., Xie, N., Shen, H.T., Li, X.: Describing video with attention based bidirectional lstm. IEEE Transactions on Cybernetics (2018)
Zelnik-Manor, L., Perona, P.: Self-tuning spectral clustering. In: NIPS, pp. 1601–1608 (2005)
Zhang, S., Li, X., Ming, Z., Zhu, X., Cheng, D.: Learning k for knn classification. Acm Trans Intell Syst Technol 8(3), 43 (2017)
Zhang, S., Li, X., Zong, M., Zhu, X., Wang, R.: Efficient knn classification with different numbers of nearest neighbors. IEEE Trans Neural Netw Learn Syst 29 (5), 1774–1785 (2018)
Zheng, W., Zhu, X., Zhu, Y., Hu, R., Lei, Cong: Dynamic graph learning for spectral feature selection. Multimed Tools Appl 77(22), 29739–29755 (2018)
Zhou, X., Shen, F., Liu, Li, Liu, W., Nie, L., Yang, Y., Shen, H.T.: Graph convolutional network hashing (2018)
Zhu, X., Li, X., Zhang, S.: Block-row sparse multiview multilabel learning for image classification. IEEE Trans Cybern 46(2), 450–461 (2016)
Zhu, X., He, W., Li, Y., Yang, Y., Zhang, S., Hu, R., Xhu, Y.: One-step spectral clustering via dynamically learning affinity matrix and subspace, pp. 2963–2969 (2017)
Zhu, X., Li, X., Zhang, S., Ju, C., Wu, X.: Robust joint graph sparse coding for unsupervised spectral feature selection. IEEE Trans Neural Netw Learn Syst 28(6), 1263–1275 (2017)
Zhu, X., Li, X., Zhang, S., Xu, Z., Yu, L., Wang, C.: Graph pca hashing for similarity search. IEEE Trans Multimed 19(9), 2033–2044 (2017)
Zhu, X., Shichao, Z., Li, Y., Jilian, Z., Lifeng, Y., Yue, F.: Low-rank sparse subspace for spectral clustering. IEEE Trans Knowl Data Eng 31(8), 1532–1543 (2019)
Zhu, X., Zhang, S., Hu, R., He, W., Lei, C., Zhu, P.: One-step multi-view spectral clustering. IEEE Transactions on Knowledge and Data Engineering. https://doi.org/10.1109/TKDE.2018.2873378 (2018)
Zhu, X., Zhang, S., Li, Y., Zhang, J., Yang, L., Fang, Y.: Low-rank sparse subspace for spectral clustering. IEEE Transactions on Knowledge and Data Engineering. https://doi.org/10.1109/TKDE.2018.2858782
Zhu, X., Zhu, Y., Zhang, S., Hu, R., He, W.: Adaptive hypergraph learning for unsupervised feature selection, 3581–3587, pp. (2018)
Zhu, Y., Zhu, X., Zheng, W.: Robust multi-view learning via half-quadratic minimization, pp. 3278–3284 (2018)
Acknowledgments
This work is partially supported by the China Key Research Program (Grant No: 2016YFB1000905), the Natural Science Foundation of China (Grants No: 61836016, 61876046, 61573270, and 61672177), the Project of Guangxi Science and Technology (GuiKeAD17195062), the Guangxi Collaborative Innovation Center of Multi-Source Information Integration and Intelligent Processing, the Guangxi High Institutions Program of Introducing 100 High-Level Overseas Talents, the Strategic Research Excellence Fund at Massey University, the Marsden Fund of New Zealand (Grant No: MAU1721), and the Research Fund of Guangxi Key Lab of Multisource Information Mining and Security (18-A-01-01).
Author information
Authors and Affiliations
Corresponding author
Additional information
Publisher’s note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
This article belongs to the Topical Collection: Computational Social Science as the Ultimate Web Intelligence
Guest Editors: Xiaohui Tao, Juan D. Velasquez, Jiming Liu, and Ning Zhong
Rights and permissions
About this article
Cite this article
Zhu, X., Gan, J., Lu, G. et al. Spectral clustering via half-quadratic optimization. World Wide Web 23, 1969–1988 (2020). https://doi.org/10.1007/s11280-019-00731-8
Received:
Revised:
Accepted:
Published:
Issue Date:
DOI: https://doi.org/10.1007/s11280-019-00731-8