BlackEye: automatic IP blacklisting using machine learning from security logs

Jeon, Dooyong; Tak, Byungchul

doi:10.1007/s11276-019-02201-5

Published: 03 December 2019

Volume 28, pages 937–948 (2022)
Wireless Networks

Abstract

Blacklisting malicious IP addresses is a primary technique for safeguarding mission-critical IT systems. The decision to blacklist an IP address requires careful examination of various aspects of packet traffic data as well as the address's behavioral history. Most current security monitoring for IP blacklisting relies heavily on the domain expertise of experienced specialists. Although there have been efforts to apply machine-learning (ML) techniques to this problem, a mature solution has yet to emerge. To mitigate these challenges and to gain a better understanding of the problem, we have designed the BlackEye framework, in which we can apply various ML techniques and produce models for accurate blacklisting. From our analysis, we learn that a multi-staged method that combines data cleansing with classification via logistic regression or random forest produces the best results. Our evaluation on real-world data shows that it can reduce incorrect blacklisting by nearly 90% compared with the performance of experts. Moreover, our proposed model performed well in terms of time-to-blacklist, curtailing the period during which a malicious IP address remains active by 27 days on average.
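
To make the multi-staged idea in the abstract concrete, the following is a minimal sketch of a cleanse-then-classify pipeline using logistic regression. It is not the authors' implementation: the feature names (`alert_rate`, `port_spread`), the toy records, the cleansing rule (drop incomplete and duplicate records), and all function names are assumptions chosen only for illustration, and the classifier is a plain gradient-descent logistic regression written with the Python standard library.

```python
import math

def cleanse(records):
    """Stage 1 (assumed rule): drop records with missing fields and exact duplicates."""
    seen, cleaned = set(), []
    for rec in records:
        if any(value is None for value in rec):
            continue  # incomplete log record
        if rec in seen:
            continue  # duplicate log record
        seen.add(rec)
        cleaned.append(rec)
    return cleaned

def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def train_logistic(rows, labels, lr=0.5, epochs=2000):
    """Stage 2: fit logistic regression with batch gradient descent."""
    n_features = len(rows[0])
    weights = [0.0] * n_features
    bias = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * n_features
        grad_b = 0.0
        for x, y in zip(rows, labels):
            err = sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias) - y
            for j, xi in enumerate(x):
                grad_w[j] += err * xi
            grad_b += err
        weights = [w - lr * g / len(rows) for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / len(rows)
    return weights, bias

def predict(weights, bias, x):
    """Return 1 (blacklist) if the estimated probability is at least 0.5, else 0."""
    score = sum(w * xi for w, xi in zip(weights, x)) + bias
    return 1 if sigmoid(score) >= 0.5 else 0

# Hypothetical per-IP records: (alert_rate, port_spread, label), features scaled to [0, 1].
RAW = [
    (0.9, 0.8, 1), (0.8, 0.9, 1), (0.95, 0.7, 1), (0.85, 0.85, 1),
    (0.1, 0.2, 0), (0.2, 0.1, 0), (0.15, 0.05, 0), (0.05, 0.15, 0),
    (0.9, 0.8, 1),    # exact duplicate, removed by cleansing
    (None, 0.5, 0),   # missing feature, removed by cleansing
]

clean = cleanse(RAW)
X = [rec[:-1] for rec in clean]
y = [rec[-1] for rec in clean]
weights, bias = train_logistic(X, y)
predictions = [predict(weights, bias, x) for x in X]
```

Separating the cleansing stage from the classifier keeps noisy records out of training; a random forest could be substituted for the second stage without changing the first.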

Acknowledgements

This research was supported by Kyungpook National University Research Fund, 2017.

Author information

Authors and Affiliations

Kyungpook National University, 80 Daehakro, Bukgu, Daegu, 41566, Korea
Dooyong Jeon & Byungchul Tak

Corresponding author

Correspondence to Byungchul Tak.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Jeon, D., Tak, B. BlackEye: automatic IP blacklisting using machine learning from security logs. Wireless Netw 28, 937–948 (2022). https://doi.org/10.1007/s11276-019-02201-5

Published: 03 December 2019
Issue Date: February 2022
DOI: https://doi.org/10.1007/s11276-019-02201-5
