Density Peak Clustering Based on Cumulative Nearest Neighbors Degree and Micro Cluster Merging
In June 2014, Rodriguez et al. published in Science an algorithm called clustering by fast search and find of density peaks (DPC). It finds density peaks quickly and clusters datasets efficiently, but it has two drawbacks. First, its local density definition is too simple: on datasets that contain both dense and sparse clusters, neither of its two local density definitions identifies the density peaks correctly. Second, its assignment strategy has poor fault tolerance: if one point is misallocated, the subsequent assignments amplify the error, which can seriously degrade the clustering result. To address these problems, a new clustering method, density peak clustering based on cumulative nearest neighbors degree and micro cluster merging, is proposed. The method improves DPC in two ways. First, it defines a new local density to overcome the weakness of the original definitions. Second, it combines graph degree linkage with DPC to alleviate allocation errors. Experiments on synthetic and real-world datasets show that the proposed method outperforms DPC, DBSCAN, OPTICS, AP, K-Means, and other DPC variants.
Keywords: Density peak clustering, K-nearest neighbors, Local density, Micro cluster, Allocation strategy
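As a rough illustration of the baseline that the abstract criticizes, the following Python sketch implements the original DPC procedure as it is commonly described. It is not the method proposed in this paper. It covers the two local density definitions (cutoff and Gaussian kernel) and the chained assignment step, in which every point inherits the label of its nearest higher-density neighbor, so one wrong link is passed on to all points that follow it. The function name dpc_baseline, the parameters dc and n_clusters, the selection of centers by the largest density-times-distance product, and the toy data are all assumptions made for illustration.

```python
import numpy as np


def dpc_baseline(X, dc, n_clusters, kernel="gaussian"):
    """Sketch of baseline DPC; names and defaults are illustrative assumptions."""
    X = np.asarray(X, dtype=float)
    n = X.shape[0]

    # Pairwise Euclidean distances (O(n^2) memory, fine for small data).
    diff = X[:, None, :] - X[None, :, :]
    d = np.sqrt((diff ** 2).sum(axis=-1))

    # Local density: cutoff count or Gaussian kernel, excluding the point itself.
    if kernel == "cutoff":
        rho = (d < dc).sum(axis=1) - 1.0
    else:
        rho = np.exp(-(d / dc) ** 2).sum(axis=1) - 1.0

    # Visit points from highest to lowest density.
    order = np.argsort(-rho, kind="stable")

    # delta: distance to the nearest point of higher density.
    # parent: index of that point, used later for label propagation.
    delta = np.zeros(n)
    parent = np.full(n, -1)
    delta[order[0]] = d[order[0]].max()  # convention for the global peak
    for rank in range(1, n):
        i = order[rank]
        higher = order[:rank]
        j = higher[np.argmin(d[i, higher])]
        delta[i] = d[i, j]
        parent[i] = j

    # Centers: points with the largest rho * delta (assumed selection rule).
    gamma = rho * delta
    centers = np.argsort(-gamma, kind="stable")[:n_clusters]

    # Chained assignment: each point copies the label of its parent, so a
    # single wrong parent link propagates to every point that depends on it.
    labels = np.full(n, -1)
    labels[centers] = np.arange(n_clusters)
    for i in order:
        if labels[i] == -1:
            labels[i] = labels[parent[i]]
    return labels, centers


# Toy data (assumed): two well-separated Gaussian blobs of 30 points each.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0.0, 0.3, (30, 2)), rng.normal(5.0, 0.3, (30, 2))])
labels, centers = dpc_baseline(X, dc=0.5, n_clusters=2)
```

On well-separated data like this the chained assignment is harmless. The weakness described above appears when clusters touch or differ strongly in density, because a single dc then fits neither the dense nor the sparse region.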
This research was supported by the National Natural Science Foundation of China (Grant Nos. 71433003 and 51669014) and the Science Fund for Distinguished Young Scholars of Jiangxi Province (Grant No. 2018ACB21029).