Rotation-DPeak: Improving Density Peaks Selection for Imbalanced Data

Hu, Xiaoliang; Yan, Ming; Chen, Yewang; Yang, Lijie; Du, Jixiang

doi:10.1007/978-981-16-0705-9_4


Part of the book series: Communications in Computer and Information Science (CCIS, volume 1320)

Included in the following conference series:

CCF Conference on Big Data

Abstract

Density Peak (DPeak) is an effective clustering algorithm. It maps data of arbitrary dimension onto a 2-dimensional space, in which cluster centers and outliers automatically fall in the upper-right and upper-left corners, respectively. However, DPeak is not suitable for imbalanced data sets with large differences in density, where sparse clusters are usually not identified. Hence, an improved DPeak, namely Rotation-DPeak, is proposed to overcome this drawback, based on a simple idea: the higher the density of a point p, the larger the \(\delta \) it should have in order to be picked as a density peak, where \(\delta \) is the distance from p to its nearest neighbor with higher density. We then use a quadratic curve to select the points with the largest decision gap as density peaks, instead of choosing the points with the largest \(\gamma \), where \(\gamma =\rho \times \delta \). Experiments show that the proposed algorithm performs better on imbalanced data sets, which indicates that it is promising.

https://github.com/XFastDataLab/Rotation-DPeak.
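
The quantities named in the abstract can be illustrated with a minimal sketch of the classic DPeak decision values. It assumes a cutoff-kernel density (the number of other points within a radius `d_c`) and assigns the densest point the largest distance to any other point as its \(\delta \); the function name `decision_values`, the parameter `d_c`, and the toy data are illustrative assumptions. This is not the authors' implementation, and it does not include the quadratic-curve selection of Rotation-DPeak.

```python
import numpy as np


def decision_values(points, d_c):
    """Return (rho, delta, gamma) for a classic DPeak decision graph.

    Assumptions (not taken from the abstract): cutoff-kernel density with
    radius d_c; strict comparison of densities, so ties are not broken.
    """
    X = np.asarray(points, dtype=float)
    n = len(X)
    # Pairwise Euclidean distances, shape (n, n).
    dist = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)
    # rho: number of other points closer than d_c (subtract the point itself).
    rho = (dist < d_c).sum(axis=1) - 1
    delta = np.empty(n)
    for i in range(n):
        higher = rho > rho[i]
        if higher.any():
            # Distance to the nearest neighbor with higher density.
            delta[i] = dist[i, higher].min()
        else:
            # No denser point exists: use the largest distance from i.
            delta[i] = dist[i].max()
    gamma = rho * delta
    return rho, delta, gamma


# Toy imbalanced data: a dense cluster near the origin, a sparse one far away.
points = [
    (0.0, 0.0), (0.1, 0.0), (-0.1, 0.0), (0.0, 0.1), (0.0, -0.1),
    (10.0, 10.0), (10.5, 10.0), (9.5, 10.0),
]
rho, delta, gamma = decision_values(points, d_c=0.15)
```

In this toy data every point of the sparse cluster gets a low density and therefore a small \(\gamma \), even though its \(\delta \) is large, which is the kind of case the abstract describes as hard for selection by largest \(\gamma \).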

Notes

1. https://archive.ics.uci.edu/ml/index.php

Acknowledgment

We acknowledge financial support from the National Natural Science Foundation of China (No. 61673186, 61972010, 61975124).

Author information

Authors and Affiliations

The College of Computer Science and Technology, Huaqiao University, Xiamen, China
Xiaoliang Hu, Ming Yan, Yewang Chen, Lijie Yang & Jixiang Du
Provincial Key Laboratory for Computer Information Processing Technology, Soochow University, Soochow, China
Xiaoliang Hu, Ming Yan & Yewang Chen
Fujian Key Laboratory of Big Data Intelligence and Security, Huaqiao University, Xiamen, China
Xiaoliang Hu, Ming Yan, Yewang Chen & Jixiang Du

Corresponding author

Correspondence to Yewang Chen.

Editor information

Editors and Affiliations

PLA Academy of Military Sciences, Beijing, China
Hong Mei
Southwest University, Chongqing, China
Weiguo Zhang
The University of Edinburgh, Edinburgh, UK
Wenfei Fan
Southwest University, Chongqing, China
Zili Zhang
Nanjing University, Nanjing, China
Yihua Huang
Zhejiang University, Hangzhou, China
Jiajun Bu
Nanjing University, Nanjing, China
Yang Gao
Taiyuan University of Technology, Taiyuan, China
Li Wang

About this paper

Cite this paper

Hu, X., Yan, M., Chen, Y., Yang, L., Du, J. (2021). Rotation-DPeak: Improving Density Peaks Selection for Imbalanced Data. In: Mei, H., et al. Big Data. BigData 2020. Communications in Computer and Information Science, vol 1320. Springer, Singapore. https://doi.org/10.1007/978-981-16-0705-9_4

DOI: https://doi.org/10.1007/978-981-16-0705-9_4
Published: 01 April 2021
Publisher Name: Springer, Singapore
Print ISBN: 978-981-16-0704-2
Online ISBN: 978-981-16-0705-9
eBook Packages: Computer Science, Computer Science (R0)
