Abstract
Accurate detection and capture of anomalous data in complex network data streams is an important part of ensuring network security. Traditional methods cannot adapt to the highly dynamic characteristics of abnormal data in complex networks, which reduces detection accuracy. This paper proposes a parallel k-means clustering algorithm optimized by particle swarm optimization with a dynamic adaptive inertia weight (dsPSOK-means) and applies it to mining anomalous data in mass sensor networks. The inertia weight is adjusted dynamically through the fitness function, which makes the dsPSO algorithm adaptive. The output of the dsPSO algorithm is then used as the input of the k-means algorithm, so that the initial center points are selected more intelligently and adaptively. Finally, the dsPSOK-means clustering algorithm is parallelized on the Spark platform for a cluster environment. Experimental results show that the dsPSOK-means algorithm effectively reduces the traffic among nodes during execution, and that its accuracy in mining abnormal data from complex network data flows is on average 5% higher than that of the comparison algorithm.
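The pipeline described in the abstract (a PSO whose inertia weight depends on fitness, whose best position then seeds k-means) can be illustrated with a minimal single-machine sketch. The abstract does not give the update formulas, so everything specific here is an assumption: the inertia rule in `adaptive_inertia` is a commonly used fitness-based scheme rather than the paper's, the fitness is the sum of squared distances to the nearest center, and all function names and parameter values are hypothetical. Spark parallelization is not shown.

```python
import random

def sq_dist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def sse(points, centers):
    """Fitness (assumed): sum of squared distances to the nearest center."""
    return sum(min(sq_dist(p, c) for c in centers) for p in points)

def adaptive_inertia(f, f_min, f_avg, w_min=0.4, w_max=0.9):
    """Assumed rule, not the paper's: particles with better-than-average
    fitness get a smaller weight (local search), the rest get w_max."""
    if f <= f_avg and f_avg > f_min:
        return w_min + (w_max - w_min) * (f - f_min) / (f_avg - f_min)
    return w_max

def pso_init_centers(points, k, n_particles=10, iters=30, c1=1.5, c2=1.5, seed=0):
    rng = random.Random(seed)
    dim = len(points[0])
    # Each particle encodes k candidate centers, started at random data points.
    pos = [[list(p) for p in rng.sample(points, k)] for _ in range(n_particles)]
    vel = [[[0.0] * dim for _ in range(k)] for _ in range(n_particles)]
    fit = [sse(points, p) for p in pos]
    pbest = [[c[:] for c in p] for p in pos]
    pbest_fit = fit[:]
    g = min(range(n_particles), key=lambda i: fit[i])
    gbest, gbest_fit = [c[:] for c in pos[g]], fit[g]
    for _ in range(iters):
        f_min, f_avg = min(fit), sum(fit) / len(fit)
        for i in range(n_particles):
            w = adaptive_inertia(fit[i], f_min, f_avg)
            for j in range(k):
                for d in range(dim):
                    r1, r2 = rng.random(), rng.random()
                    vel[i][j][d] = (w * vel[i][j][d]
                                    + c1 * r1 * (pbest[i][j][d] - pos[i][j][d])
                                    + c2 * r2 * (gbest[j][d] - pos[i][j][d]))
                    pos[i][j][d] += vel[i][j][d]
            fit[i] = sse(points, pos[i])
            if fit[i] < pbest_fit[i]:
                pbest[i], pbest_fit[i] = [c[:] for c in pos[i]], fit[i]
                if fit[i] < gbest_fit:
                    gbest, gbest_fit = [c[:] for c in pos[i]], fit[i]
    return gbest

def kmeans(points, centers, iters=20):
    """Plain Lloyd iterations starting from the given centers."""
    centers = [c[:] for c in centers]
    for _ in range(iters):
        clusters = [[] for _ in centers]
        for p in points:
            nearest = min(range(len(centers)), key=lambda j: sq_dist(p, centers[j]))
            clusters[nearest].append(p)
        for j, members in enumerate(clusters):
            if members:  # an empty cluster keeps its previous center
                centers[j] = [sum(col) / len(members) for col in zip(*members)]
    return centers

# Synthetic two-cluster data, for illustration only.
rng = random.Random(1)
data = ([[rng.gauss(0, 0.5), rng.gauss(0, 0.5)] for _ in range(50)]
        + [[rng.gauss(5, 0.5), rng.gauss(5, 0.5)] for _ in range(50)])
seeds = pso_init_centers(data, k=2)   # PSO output ...
centers = kmeans(data, seeds)         # ... becomes the k-means input
print(centers)
```

In this sketch the PSO only supplies the starting centers; k-means then refines them, which mirrors the two-stage structure the abstract describes without claiming to reproduce its exact update rules.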
Acknowledgements
This work was supported by the Guangdong Province Ministry of Education Industry-University-Research Integration Project (No. 2012B091100288).
About this article
Cite this article
Yuan, J. An Anomaly Data Mining Method for Mass Sensor Networks Using Improved PSO Algorithm Based on Spark Parallel Framework. J Grid Computing 18, 251–261 (2020). https://doi.org/10.1007/s10723-020-09505-3
DOI: https://doi.org/10.1007/s10723-020-09505-3