Abstract
When clustering high-dimensional data, most state-of-the-art algorithms first extract principal components and then apply a specific clustering method. However, the assignments produced by this two-stage strategy may deviate from those obtained by directly optimizing a unified objective function. Unlike these traditional methods, we propose a novel method, clustering by unified principal component analysis and fuzzy c-means (UPF), for clustering high-dimensional data. Our model explores the underlying clustering structure in a low-dimensional space and performs clustering simultaneously. In particular, we impose an L0-norm constraint on the membership matrix to make it sparser. To solve the model, we propose an effective iterative optimization algorithm. Extensive experiments on several benchmark data sets, in comparison with two-stage algorithms, validate the effectiveness of the proposed method. The experimental results demonstrate the superior performance of the proposed method.
Submitted to the ICA3PP 2020 Special Session on Artificial Intelligence and Security (AIS).
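For orientation, the two-stage baseline that the abstract contrasts with can be sketched as follows. This is a minimal illustration, not the UPF algorithm: it runs PCA first and standard fuzzy c-means second, then mimics the effect of an L0-style constraint by keeping only the k largest memberships per sample as a post-processing step. The function names, the fuzzifier `m`, the value of `k`, and the synthetic data are all assumptions made for this sketch; the unified objective and its solver are described only in the full paper.

```python
import numpy as np

def pca_project(X, d):
    """Project the rows of X onto the top-d principal components."""
    Xc = X - X.mean(axis=0)
    # Rows of Vt are principal directions, ordered by decreasing singular value.
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ Vt[:d].T

def fuzzy_c_means(Z, c, m=2.0, n_iter=100, seed=0):
    """Standard fuzzy c-means with fuzzifier m > 1 (assumed default 2.0)."""
    rng = np.random.default_rng(seed)
    U = rng.random((Z.shape[0], c))
    U /= U.sum(axis=1, keepdims=True)  # each row is a membership distribution
    for _ in range(n_iter):
        W = U ** m
        centers = (W.T @ Z) / W.sum(axis=0)[:, None]
        # Squared Euclidean distance from every sample to every center.
        dist2 = ((Z[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        dist2 = np.maximum(dist2, 1e-12)  # guard against division by zero
        inv = dist2 ** (-1.0 / (m - 1.0))
        U = inv / inv.sum(axis=1, keepdims=True)
    return U, centers

def sparsify_rows(U, k):
    """Keep the k largest memberships per row, zero the rest, renormalize.

    Hypothetical stand-in for a row-wise L0 constraint (at most k nonzeros).
    """
    drop = np.argsort(U, axis=1)[:, :-k]  # indices of the smallest c - k entries
    S = U.copy()
    np.put_along_axis(S, drop, 0.0, axis=1)
    return S / S.sum(axis=1, keepdims=True)

# Synthetic example: three Gaussian blobs in 20 dimensions.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(loc=mu, scale=0.3, size=(30, 20)) for mu in (0.0, 3.0, 6.0)])

Z = pca_project(X, d=2)             # stage 1: dimensionality reduction
U, centers = fuzzy_c_means(Z, c=3)  # stage 2: clustering in the reduced space
S = sparsify_rows(U, k=2)           # sparse memberships, at most 2 nonzeros per row
labels = S.argmax(axis=1)
```

Because the projection here is fixed before clustering starts, the subspace cannot adapt to the cluster structure. That is the limitation the abstract attributes to two-stage strategies and the motivation for optimizing both parts jointly.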
Acknowledgements
This work was partially supported by the National Natural Science Foundation of China (61962012), the Xing-Long scholar project of Lanzhou University of Finance and Economics, and the Gansu Provincial Institutions of Higher Learning Innovation Ability Promotion Project (2019B-97).
Copyright information
© 2020 Springer Nature Switzerland AG
About this paper
Cite this paper
Wang, J., Shi, Q., Yang, Z., Nie, F. (2020). Clustering by Unified Principal Component Analysis and Fuzzy C-Means with Sparsity Constraint. In: Qiu, M. (eds) Algorithms and Architectures for Parallel Processing. ICA3PP 2020. Lecture Notes in Computer Science, vol 12453. Springer, Cham. https://doi.org/10.1007/978-3-030-60239-0_23
DOI: https://doi.org/10.1007/978-3-030-60239-0_23
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-60238-3
Online ISBN: 978-3-030-60239-0
eBook Packages: Mathematics and Statistics, Mathematics and Statistics (R0)