An Anomaly Data Mining Method for Mass Sensor Networks Using Improved PSO Algorithm Based on Spark Parallel Framework

Journal of Grid Computing

Abstract

Accurately detecting and capturing anomalous data in complex network data streams is an important part of ensuring network security. Traditional methods cannot adapt to the highly dynamic characteristics of anomalous data in complex networks, which reduces their detection accuracy. This paper proposes a parallel k-means clustering algorithm optimized by particle swarm optimization with a dynamic adaptive inertia weight (dsPSOK-means) and uses it to mine anomaly data in mass sensor networks. The inertia weight is adjusted dynamically through the fitness function, which makes the dsPSO algorithm adaptive. The output of the dsPSO algorithm is then taken as the input of the k-means algorithm, which makes the selection of the initial center points more intelligent and self-adaptive. Finally, the dsPSOK-means clustering algorithm is parallelized on the Spark platform and implemented in a cluster environment. Experimental results show that the dsPSOK-means algorithm effectively reduces the traffic among nodes during execution, and that its accuracy in mining anomalous data from complex network data flows is on average 5% higher than that of the comparison algorithm.
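The two-stage idea in the abstract (a PSO search whose best position seeds k-means) can be illustrated with a minimal single-machine sketch. The abstract does not give the inertia weight formula, so `adaptive_weight` below uses one commonly cited fitness-based rule as an assumption; it is not taken from the paper. The function names, the parameter values (`w_min`, `w_max`, `c1`, `c2`, swarm size) and the synthetic two-cluster data are likewise hypothetical, and the Spark parallelization is omitted.

```python
import random

def sse(centers, points):
    """Fitness: sum of squared distances from each point to its nearest center."""
    return sum(min(sum((p - c) ** 2 for p, c in zip(pt, ctr)) for ctr in centers)
               for pt in points)

def adaptive_weight(f, f_min, f_avg, w_min=0.4, w_max=0.9):
    """Assumed rule: particles with better-than-average fitness get a smaller
    inertia weight (local refinement); the rest keep w_max (exploration)."""
    if f <= f_avg and f_avg > f_min:
        return w_min + (w_max - w_min) * (f - f_min) / (f_avg - f_min)
    return w_max

def pso_seed(points, k, n_particles=10, iters=30, c1=1.5, c2=1.5, rng=None):
    """Search for k initial centers; each particle is one candidate set of centers."""
    rng = rng or random.Random(0)
    dim = len(points[0])
    pos = [[list(p) for p in rng.sample(points, k)] for _ in range(n_particles)]
    vel = [[[0.0] * dim for _ in range(k)] for _ in range(n_particles)]
    fit = [sse(x, points) for x in pos]
    pbest = [[c[:] for c in x] for x in pos]
    pbest_fit = fit[:]
    g = min(range(n_particles), key=lambda i: fit[i])
    gbest, gbest_fit = [c[:] for c in pos[g]], fit[g]
    for _ in range(iters):
        f_min, f_avg = min(fit), sum(fit) / len(fit)
        for i in range(n_particles):
            w = adaptive_weight(fit[i], f_min, f_avg)
            for j in range(k):
                for d in range(dim):
                    r1, r2 = rng.random(), rng.random()
                    vel[i][j][d] = (w * vel[i][j][d]
                                    + c1 * r1 * (pbest[i][j][d] - pos[i][j][d])
                                    + c2 * r2 * (gbest[j][d] - pos[i][j][d]))
                    pos[i][j][d] += vel[i][j][d]
            fit[i] = sse(pos[i], points)
            if fit[i] < pbest_fit[i]:
                pbest[i], pbest_fit[i] = [c[:] for c in pos[i]], fit[i]
            if fit[i] < gbest_fit:
                gbest, gbest_fit = [c[:] for c in pos[i]], fit[i]
    return gbest

def kmeans(points, centers, iters=20):
    """Plain Lloyd iterations starting from the supplied centers."""
    centers = [c[:] for c in centers]
    for _ in range(iters):
        groups = [[] for _ in centers]
        for pt in points:
            j = min(range(len(centers)),
                    key=lambda j: sum((p - c) ** 2 for p, c in zip(pt, centers[j])))
            groups[j].append(pt)
        for j, grp in enumerate(groups):
            if grp:  # an empty cluster keeps its previous center
                centers[j] = [sum(col) / len(grp) for col in zip(*grp)]
    return centers

# Hypothetical data: two Gaussian clusters standing in for sensor readings.
data_rng = random.Random(1)
data = ([(data_rng.gauss(0, 0.5), data_rng.gauss(0, 0.5)) for _ in range(50)]
        + [(data_rng.gauss(5, 0.5), data_rng.gauss(5, 0.5)) for _ in range(50)])
seed = pso_seed(data, k=2)      # stage 1: PSO output
centers = kmeans(data, seed)    # stage 2: k-means started from the PSO output
```

In this sketch the PSO stage only replaces the random choice of initial centers; the k-means stage is unchanged, so the clustering error it reports can never be worse than that of the seed it was given.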

Acknowledgements

This work was supported by the Guangdong Province Ministry of Education Industry-University-Research Integration Project (No. 2012B091100288).

Author information

Corresponding author

Correspondence to Jingzhen Yuan.

Cite this article

Yuan, J. An Anomaly Data Mining Method for Mass Sensor Networks Using Improved PSO Algorithm Based on Spark Parallel Framework. J Grid Computing 18, 251–261 (2020). https://doi.org/10.1007/s10723-020-09505-3
