Density peaks algorithm based on information entropy and merging strategy for power load curve clustering

The Journal of Supercomputing

Abstract

To address the sensitivity of the density peaks clustering (DPC) algorithm to the cutoff distance and the subjectivity of its cluster center selection, we propose an improved density peaks algorithm based on information entropy and a merging strategy (DPC-IEMS) for power load curve clustering. First, a cutoff distance optimization method based on information entropy is proposed. This method uses the sparrow search algorithm (SSA) to minimize the information entropy of the product of local density and relative distance, which yields the optimal cutoff distance for the load dataset at hand. Then, a merging strategy is proposed to select cluster centers adaptively. This strategy first generates a large number of initial sub-clusters with DPC and then merges them according to a fusion condition until the final iteration condition is satisfied. DPC-IEMS is evaluated on U.S. and Chinese load datasets, and the results demonstrate its effectiveness and practicality for power load curve clustering.
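To make the cutoff distance optimization concrete, the following minimal Python sketch computes the local density, relative distance and their product for a candidate cutoff distance, and then scores that cutoff by an entropy value. Several parts are assumptions rather than the published method: the cutoff-kernel form of the local density is the standard DPC definition, the entropy is taken as the Shannon entropy of the products normalised to sum to one, and a plain grid search over hypothetical candidate cutoffs stands in for SSA. All function names are illustrative.

```python
import math


def pairwise_distances(points):
    """Euclidean distance matrix d_ij for a list of equal-length curves."""
    n = len(points)
    return [[math.dist(points[i], points[j]) for j in range(n)] for i in range(n)]


def dpc_quantities(dist, d_c):
    """Return (rho, delta, gamma) using the cutoff-kernel form of DPC."""
    n = len(dist)
    # rho_i: number of other points closer than the cutoff distance d_c
    rho = [sum(1 for j in range(n) if j != i and dist[i][j] < d_c) for i in range(n)]
    delta = []
    for i in range(n):
        # delta_i: distance to the nearest point of strictly higher density
        higher = [dist[i][j] for j in range(n) if rho[j] > rho[i]]
        # points with no denser neighbour take their largest distance instead
        delta.append(min(higher) if higher else max(dist[i]))
    gamma = [r * d for r, d in zip(rho, delta)]
    return rho, delta, gamma


def gamma_entropy(gamma):
    """Shannon entropy of gamma normalised to sum to 1 (assumed form)."""
    total = sum(gamma)
    if total == 0:
        # degenerate cutoff (no point has a neighbour): never the best choice
        return float("inf")
    probs = [g / total for g in gamma if g > 0]
    return -sum(p * math.log(p) for p in probs)


def best_cutoff(points, candidates):
    """Grid-search stand-in for SSA: candidate d_c with the lowest entropy."""
    dist = pairwise_distances(points)
    scored = [(gamma_entropy(dpc_quantities(dist, d_c)[2]), d_c) for d_c in candidates]
    return min(scored)[1]
```

A low entropy means that the product is concentrated on a few curves, which is the situation in which candidate cluster centers stand out clearly. In practice the candidate range would have to be chosen with care, since very small cutoffs can also produce a concentrated but meaningless distribution.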

Availability of data and materials

The U.S. load data are available to download from https://dx.doi.org/10.25984/1876417. The Chinese load data cannot be shared for privacy reasons.

Abbreviations

DPC: Density peaks clustering algorithm
SSA: Sparrow search algorithm
FCM: Fuzzy C-means
KNN: K-nearest neighbor
SC: Silhouette coefficient
CH: Calinski-Harabasz score
DVI: Dunn validity index
DBI: Davies-Bouldin score
WPD: Wavelet packet decomposition
DWT: Discrete wavelet transform
PCA: Principal component analysis
DPC-IE: DPC algorithm based on information entropy
DPC-MS: DPC algorithm based on merging strategy
DPC-IEMS: Density peaks algorithm based on information entropy and merging strategy
\(\rho_{i}\): The local density of DPC
\(\delta_{i}\): The relative distance of DPC
\(d_{c}\): The cutoff distance of DPC
\(d_{ij}\): The Euclidean distance between point i and point j
\(\gamma_{i}\): The product of local density and relative distance
\(CL^{\prime}\): The initial sub-clusters
\(ICl^{\prime}\): The initial sub-cluster center indexes
\(Cl^{\prime}_{j}\): The j-th initial sub-cluster
\(icl^{\prime}_{j}\): The j-th initial sub-cluster center index
\(d_{near}\): Distance between the cluster center of the initial sub-cluster and the cluster's nearest neighbor curve
FT: The fusion threshold
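
For orientation, the first five symbols above are related as follows in the standard DPC formulation with a cutoff kernel. These are the textbook definitions and are given here as an assumption; this section does not state which kernel the paper itself uses.

\[
\rho_{i} = \sum_{j \ne i} \chi\left(d_{ij} - d_{c}\right), \qquad \chi(x) = \begin{cases} 1, & x < 0 \\ 0, & x \ge 0 \end{cases}
\]

\[
\delta_{i} = \min_{j:\, \rho_{j} > \rho_{i}} d_{ij}, \qquad \delta_{i} = \max_{j} d_{ij} \ \text{ for the point of highest density}
\]

\[
\gamma_{i} = \rho_{i}\,\delta_{i}
\]

The cutoff distance \(d_{c}\) enters only through \(\rho_{i}\), so every quantity derived from \(\gamma_{i}\), including its information entropy, is a function of \(d_{c}\). Points with a large \(\gamma_{i}\) combine high local density with a large distance to any denser point, which is what makes them candidate cluster centers.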

Funding

This work was supported by the National Natural Science Foundation of China (No. 42075129) and the Hebei Province Natural Science Foundation (No. E2021202179).

Author information

Contributions

YY: Conceptualization, Methodology, Formal analysis, Validation, Investigation, Software, Writing - Original Draft, Resources, Visualization. LW: Conceptualization, Methodology, Formal analysis, Validation, Writing - Original Draft, Writing - Review & Editing, Funding acquisition. ZC: Data Curation, Visualization.

Corresponding author

Correspondence to Li Wang.

Ethics declarations

Ethical approval

This article does not contain any studies with human participants or animals performed by any of the authors.

Conflict of interest

The authors declare that they have no conflict of interest.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Yang, Y., Wang, L. & Cheng, Z. Density peaks algorithm based on information entropy and merging strategy for power load curve clustering. J Supercomput 80, 8801–8832 (2024). https://doi.org/10.1007/s11227-023-05793-0

  • DOI: https://doi.org/10.1007/s11227-023-05793-0
