Abstract
Traditional methods for mining abnormal network data streams adapt poorly to interference from noisy or anomalous data, take a long time to run, and achieve low accuracy. To address these problems, a network abnormal data stream mining method based on improved clustering analysis is proposed. First, a preprocessing model for abnormal network data flow is established to support real-time data flow queries. Next, a classification model for abnormal incremental network data is constructed to reduce the interference of noisy data on data processing. The least squares method is then used to further filter interference data out of the abnormal incremental network data, yielding a quantized data stream. Finally, the frequent-pattern data set of the abnormal network data is computed, and on this basis an improved clustering method completes the mining of the abnormal network data stream. Experimental results show that the proposed method reaches a highest anti-noise coefficient of 0.7, with shorter data mining time and higher data mining accuracy, which verifies the data stream mining performance of the method.
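The abstract does not give the formulas behind the filtering and clustering stages, so the following is only a minimal sketch of the general idea, under stated assumptions. It fits a least-squares line to a window of stream values, drops points whose residual exceeds a multiple of the RMS residual, and then groups the remaining values with plain one-dimensional k-means. Plain k-means is a stand-in for the paper's improved clustering method, which is not described here. The function names, the threshold factor `k_sigma`, and the sample values are all hypothetical.

```python
def least_squares_line(ys):
    """Fit y = intercept + slope * t by ordinary least squares, t = 0..n-1."""
    n = len(ys)
    t_bar = (n - 1) / 2
    y_bar = sum(ys) / n
    sxx = sum((t - t_bar) ** 2 for t in range(n))
    sxy = sum((t - t_bar) * (y - y_bar) for t, y in enumerate(ys))
    slope = sxy / sxx if sxx else 0.0
    return y_bar - slope * t_bar, slope


def filter_noise(ys, k_sigma=2.0):
    """Keep values whose residual is within k_sigma times the RMS residual.

    Assumption: a large residual marks interference data. The paper's actual
    filtering criterion is not stated in the abstract.
    """
    intercept, slope = least_squares_line(ys)
    residuals = [y - (intercept + slope * t) for t, y in enumerate(ys)]
    rms = (sum(r * r for r in residuals) / len(residuals)) ** 0.5
    if rms == 0:
        return list(ys)
    return [y for y, r in zip(ys, residuals) if abs(r) <= k_sigma * rms]


def nearest(value, centroids):
    """Index of the centroid closest to value (first index wins on ties)."""
    return min(range(len(centroids)), key=lambda i: abs(value - centroids[i]))


def kmeans_1d(values, k=2, iters=50):
    """Plain Lloyd's k-means on scalars; requires k >= 2 and len(values) >= k.

    Centroids are seeded from evenly spaced positions in the sorted values.
    """
    s = sorted(values)
    centroids = [float(s[i * (len(s) - 1) // (k - 1)]) for i in range(k)]
    for _ in range(iters):
        labels = [nearest(v, centroids) for v in values]
        updated = []
        for i in range(k):
            members = [v for v, lab in zip(values, labels) if lab == i]
            updated.append(sum(members) / len(members) if members else centroids[i])
        if updated == centroids:
            break
        centroids = updated
    return centroids, [nearest(v, centroids) for v in values]


# Hypothetical window of traffic measurements with one obvious spike.
stream = [10, 12, 50, 11, 400, 52, 9, 51, 11]
kept = filter_noise(stream)
centroids, labels = kmeans_1d(kept, k=2)
```

In this sketch the spike is discarded before clustering, and the remaining values separate into a low group and a high group. Which group counts as abnormal traffic is a further assumption that depends on the data.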
Cite this article
Jia, X. Research on network abnormal data flow mining based on improved cluster analysis. Distrib Parallel Databases 40, 797–813 (2022). https://doi.org/10.1007/s10619-021-07353-y
DOI: https://doi.org/10.1007/s10619-021-07353-y