The Deep Learning Modules for Cyberattack Identification in NetFlow Data Log with Ceph

  • Conference paper
Frontier Computing (FC 2019)

Part of the book series: Lecture Notes in Electrical Engineering ((LNEE,volume 551))

Abstract

In today’s fast-moving information era, the Internet has become an indispensable part of human life. However, the Internet also hides unusual network behavior, and finding that hidden behavior can reduce vulnerabilities in the network. This paper proposes a complete architecture to store and analyze collected network log data. We process and integrate the network data collected by each router on the campus and store the integrated data in Ceph storage. Ceph provides an open-source distributed storage environment with high performance, high reliability, and scalability. The raw data is first preprocessed in Python to eliminate redundant fields and unify units. The analysis of the collated data set is divided into two parts: anomaly analysis and attack identification. In the anomaly analysis, we find abnormal time periods and abnormal total flow using a threshold of three standard deviations. For attack identification, we use Keras to identify cyberattacks in the data obtained in real time: we establish an automatic identification model based on a recurrent neural network (RNN), experiment with and adjust various parameters without affecting accuracy, and then further optimize the model. The identification accuracy of the optimized model in attack identification is about 98%.
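The three-standard-deviation check mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not specify how the statistics are computed, so the following assumes a simple global mean and population standard deviation over per-period flow totals; the function name `three_sigma_outliers` and the sample data are hypothetical and not taken from the paper.

```python
from statistics import mean, pstdev


def three_sigma_outliers(flows, k=3.0):
    """Return indices of periods whose total flow lies more than
    k standard deviations away from the mean of all periods.

    Assumption: a global mean and population standard deviation are
    used; the paper may compute its threshold differently.
    """
    mu = mean(flows)
    sigma = pstdev(flows)
    if sigma == 0:
        # All periods are identical, so nothing can be an outlier.
        return []
    return [i for i, x in enumerate(flows) if abs(x - mu) > k * sigma]


# Hypothetical per-hour flow totals: 23 ordinary hours and one burst.
hourly_totals = [100.0] * 23 + [1000.0]
anomalous_hours = three_sigma_outliers(hourly_totals)
```

With this sample, only the last hour is flagged. Note that a single global threshold is sensitive to the outliers themselves, since a large burst inflates the standard deviation; a rolling window or a robust statistic would be an alternative design, though nothing in the abstract indicates which variant the authors used.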

Acknowledgment

This work was sponsored by the Ministry of Science and Technology (MOST), Taiwan, under Grant Nos. 107-2221-E-029-008 and 107-2218-E-029-003.

Author information

Corresponding author

Correspondence to Chao-Tung Yang.

Copyright information

© 2020 Springer Nature Singapore Pte Ltd.

About this paper

Cite this paper

Liu, ML., Yang, CT., Kristiani, E., Liu, JC. (2020). The Deep Learning Modules for Cyberattack Identification in NetFlow Data Log with Ceph. In: Hung, J., Yen, N., Chang, JW. (eds) Frontier Computing. FC 2019. Lecture Notes in Electrical Engineering, vol 551. Springer, Singapore. https://doi.org/10.1007/978-981-15-3250-4_37
