Malicious User Profiling Using Honeypots and Deep Learning

Levin, Roy; Scherman, Mathias; Matatov, Hana

doi:10.1007/978-3-030-98015-3_60

Part of the book series: Lecture Notes in Networks and Systems (LNNS, volume 439)

Included in the following conference series:

Future of Information and Communication Conference

Abstract

Many cloud resources, such as virtual machines, are managed through remote access. This makes management far more convenient; however, it also increases exposure to cyber-attacks, specifically to generic attacks such as Bitcoin mining, spamming, ransomware, and installing backdoors. With the proliferation of such attacks, their detection has become of major importance. In this paper we present a new supervised learning technique developed to learn and detect the patterns behind these kinds of attacks. The goal is to continuously distinguish between benign and malicious SSH logon sessions. We formulate this as a classification task in which a model is trained on benign and malicious SSH sessions. The benign sessions are collected from security-hardened machines that lack any attack indicators, and the malicious sessions are gathered from dedicated Honeypot machines set up for the sole purpose of luring attackers. As the Honeypots are not actually part of any real network, only generic attackers log onto them, usually by sweeping IP addresses and guessing passwords. We then train a Deep Neural Net (DNN) to classify the sessions as benign or malicious. Our experiments show that the Average Precision (AP) of this model reaches up to 99%. We also show that simpler ML models achieve significantly lower AP. This indicates that learning the attack patterns is not a trivial task, since traditional models cannot master it effectively. In addition to statistical measures, we also analyze and present sessions from customer VMs that were surfaced by the DNN. We manually examined these sessions to show that most of them actually require the attention of security professionals. Among these sessions we witnessed typical attack patterns, which include reconnaissance and running Bitcoin miners, ransomware, and other suspicious processes on target machines.
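
The abstract reports results as Average Precision (AP) over sessions labeled benign or malicious. As a minimal sketch of how such a score can be computed for any session classifier, the Python below ranks sessions by predicted score and averages the precision at the rank of each malicious session. The function name, the label convention (1 = malicious, from a Honeypot; 0 = benign, from a hardened machine), and the example scores are illustrative assumptions and are not taken from the paper; the sketch also does not treat tied scores specially, which library implementations may handle differently.

```python
def average_precision(labels, scores):
    """Average Precision for binary labels (1 = malicious, 0 = benign).

    Hypothetical helper for illustration: sessions are ranked by
    descending score, and AP is the mean of the precision values
    measured at the rank of each malicious session.
    """
    if len(labels) != len(scores):
        raise ValueError("labels and scores must have the same length")
    total_positive = sum(labels)
    if total_positive == 0:
        raise ValueError("AP is undefined without any malicious sessions")

    # Rank sessions from most to least suspicious.
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)

    hits = 0
    precision_sum = 0.0
    for rank, (_, label) in enumerate(ranked, start=1):
        if label == 1:
            hits += 1
            precision_sum += hits / rank  # precision at this rank
    return precision_sum / total_positive


# Made-up example: six sessions, three of them malicious.
example_labels = [1, 0, 1, 1, 0, 0]
example_scores = [0.9, 0.8, 0.7, 0.6, 0.3, 0.1]
example_ap = average_precision(example_labels, example_scores)
```

In this made-up example the malicious sessions land at ranks 1, 3, and 4, so the score is the mean of 1/1, 2/3, and 3/4. A ranking that places every malicious session ahead of every benign one yields an AP of 1.0.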

Author information

Authors and Affiliations

Microsoft, Albuquerque, USA
Roy Levin & Mathias Scherman
Technion – Israel Institute of Technology, Haifa, Israel
Hana Matatov

Corresponding author

Correspondence to Roy Levin.

Editor information

Editors and Affiliations

Saga University, Saga, Japan
Kohei Arai

About this paper

Cite this paper

Levin, R., Scherman, M., Matatov, H. (2022). Malicious User Profiling Using Honeypots and Deep Learning. In: Arai, K. (eds) Advances in Information and Communication. FICC 2022. Lecture Notes in Networks and Systems, vol 439. Springer, Cham. https://doi.org/10.1007/978-3-030-98015-3_60

DOI: https://doi.org/10.1007/978-3-030-98015-3_60
Published: 12 March 2022
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-98014-6
Online ISBN: 978-3-030-98015-3
eBook Packages: Intelligent Technologies and Robotics (R0)