Abstract
To achieve real-time progressive compression of massive data while preserving the quality of the compressed data, this paper proposes a real-time progressive compression method based on an improved clustering algorithm. In the micro-clustering stage of a BIRCH method based on K-Medoids clustering, a Clustering Feature Tree hierarchy is constructed and numerical clustering features are extracted. These features serve as the input to the macro-clustering stage, in which the leaf nodes of the Clustering Feature Tree are clustered with the improved K-Medoids method and the resulting set of data clusters is output. This set is the original data for real-time progressive compression: the data are denoised and compressed by a lifting wavelet transform, and Huffman coding is then applied to compress the result losslessly. Test results show that the method clusters well at the optimal number of cluster centers, can complete real-time progressive compression of massive data, and yields compressed data with an availability of more than 92%.
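The last two stages named in the abstract, a lifting wavelet step followed by Huffman coding, can be illustrated with a minimal sketch. It rests on assumptions that are not taken from the paper: a one-level integer Haar lifting step stands in for the wavelet (the abstract does not say which wavelet is used), denoising is a simple hard threshold on the detail coefficients, and all function names and the `threshold` parameter are hypothetical.

```python
import heapq
from collections import Counter


def haar_lift_forward(x):
    """One level of an integer Haar lifting transform (even-length input)."""
    even, odd = x[0::2], x[1::2]
    # Predict step: detail = odd sample minus its even neighbour.
    detail = [o - e for e, o in zip(even, odd)]
    # Update step: approximation = even sample plus floor(detail / 2).
    approx = [e + (d >> 1) for e, d in zip(even, detail)]
    return approx, detail


def haar_lift_inverse(approx, detail):
    """Undo the lifting steps in reverse order; exact for integers."""
    even = [a - (d >> 1) for a, d in zip(approx, detail)]
    odd = [d + e for e, d in zip(even, detail)]
    out = []
    for e, o in zip(even, odd):
        out.extend((e, o))
    return out


def huffman_code(symbols):
    """Build a prefix code {symbol: bitstring} from symbol frequencies."""
    freq = Counter(symbols)
    if len(freq) == 1:  # degenerate case: only one distinct symbol
        return {next(iter(freq)): "0"}
    # Heap entries: (weight, unique tie-breaker, {symbol: partial code}).
    heap = [(w, i, {s: ""}) for i, (s, w) in enumerate(freq.items())]
    heapq.heapify(heap)
    tie = len(heap)
    while len(heap) > 1:
        w1, _, c1 = heapq.heappop(heap)
        w2, _, c2 = heapq.heappop(heap)
        merged = {s: "0" + c for s, c in c1.items()}
        merged.update({s: "1" + c for s, c in c2.items()})
        heapq.heappush(heap, (w1 + w2, tie, merged))
        tie += 1
    return heap[0][2]


def compress(x, threshold=1):
    """Lifting transform, hard-threshold denoising, then Huffman coding."""
    approx, detail = haar_lift_forward(x)
    # Assumed denoising rule: zero out small detail coefficients (lossy).
    detail = [0 if abs(d) <= threshold else d for d in detail]
    coeffs = approx + detail
    code = huffman_code(coeffs)
    bits = "".join(code[c] for c in coeffs)
    return bits, code, len(approx)


def decompress(bits, code, n_approx):
    """Decode the Huffman bitstring and invert the lifting transform."""
    decode = {v: k for k, v in code.items()}
    coeffs, buf = [], ""
    for b in bits:
        buf += b
        if buf in decode:
            coeffs.append(decode[buf])
            buf = ""
    return haar_lift_inverse(coeffs[:n_approx], coeffs[n_approx:])
```

In this sketch the Huffman stage is lossless, so any reconstruction error comes only from the thresholding; with `threshold=0` the round trip `decompress(*compress(x, threshold=0))` returns the input unchanged.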
Data availability
Data sharing is not applicable to this article, as no datasets were generated or analysed during the current study.
Funding
No funding was received.
Author information
Contributions
All authors contributed substantially to the research. HY performed the main design of the work, the data analysis, and the revision of the work. HY, LL, and KL wrote the first draft of the manuscript and carried out the experiments. All authors commented on previous versions of the manuscript. All authors read and approved the final manuscript.
Ethics declarations
Conflict of interest
We declare there is no conflict of interest.
Ethical approval
This manuscript has not been submitted to more than one journal for simultaneous consideration; it is original and has not been published elsewhere in any form or language. The study has not been split into several parts to increase the number of submissions. The results are presented honestly, without inappropriate data manipulation.
Informed consent
This manuscript has not been previously published and is not currently in press, under review, or being considered for publication by another journal. All authors have read and approved the manuscript being submitted and agree to its submittal to this journal.
About this article
Cite this article
Yang, H., Li, L. & Li, K. Real-time progressive compression method of massive data based on improved clustering algorithm. Cluster Comput 26, 3781–3791 (2023). https://doi.org/10.1007/s10586-022-03780-3
DOI: https://doi.org/10.1007/s10586-022-03780-3