
Uit-DGAdetector: detect domains generated by algorithms using machine learning

Cluster Computing

Abstract

Recent developments in information technology have brought numerous benefits but have also created new risks for information security. One notable threat is the domain generation algorithm (DGA) technique used by botnets, which lets them automatically generate and register large numbers of domains to evade detection and blocking by network security systems. To address this threat, we investigated domain classification models targeted at botnet-generated domains. We developed three such models: a bigram-based model, a long short-term memory (LSTM) network, and an LSTM combined with one-hot encoding. We implemented an ensemble model and deployed it in a domain classification system named UIT-DGADetector. To reduce the load on the classification server, we used Kafka to queue and streamline incoming requests. The deployed system operates well and achieves high accuracy in predicting domain types. However, it still has limitations in detecting word-based DGA botnets, and the request-handling process needs further optimization to reduce waiting time in the queue. This study aims to contribute to network security and information protection, particularly by addressing DGA botnets.
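As a rough illustration of the two input representations named in the abstract (character bigrams and one-hot encoding), the following Python sketch shows how a domain string could be turned into each. The alphabet, the `max_len` default, and the function names `char_bigrams` and `one_hot` are assumptions made for this example and are not taken from the paper.

```python
from collections import Counter

# Assumed character set for domain names (illustrative, not from the paper).
ALPHABET = "abcdefghijklmnopqrstuvwxyz0123456789-."
CHAR_TO_INDEX = {ch: i for i, ch in enumerate(ALPHABET)}


def char_bigrams(domain: str) -> Counter:
    """Count overlapping two-character substrings of a lowercased domain."""
    name = domain.lower()
    return Counter(name[i:i + 2] for i in range(len(name) - 1))


def one_hot(domain: str, max_len: int = 16) -> list[list[int]]:
    """Encode a domain as max_len rows, one row per character position.

    Each row has len(ALPHABET) entries with a single 1 at the character's
    index. Characters outside ALPHABET and padding positions are all zeros.
    Domains longer than max_len are truncated.
    """
    rows = []
    for ch in domain.lower()[:max_len]:
        row = [0] * len(ALPHABET)
        idx = CHAR_TO_INDEX.get(ch)
        if idx is not None:
            row[idx] = 1
        rows.append(row)
    # Pad short domains with all-zero rows so every input has the same shape.
    while len(rows) < max_len:
        rows.append([0] * len(ALPHABET))
    return rows
```

Bigram counts of this kind could feed a conventional classifier, while the fixed-shape one-hot matrix is the sort of input a sequence model such as an LSTM typically consumes; how the paper's models actually consume these features is not stated in the abstract.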

Data availability

The data supporting the findings of this study are available from the corresponding author upon reasonable request.

Acknowledgements

This research was supported by The VNUHCM-University of Information Technology’s Scientific Research Support Fund.

Author information

Contributions

All authors contributed to the study conception and design. Material preparation, data collection and analysis were performed by NTC and NNM. The first draft of the manuscript was written by both authors and all authors commented on previous versions of the manuscript. All authors read and approved the final manuscript.

Corresponding author

Correspondence to Nguyen Tan Cam.

Ethics declarations

Conflict of interest

The authors have no relevant financial or non-financial interests to disclose.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Cam, N.T., Man, N.N. Uit-DGAdetector: detect domains generated by algorithms using machine learning. Cluster Comput (2024). https://doi.org/10.1007/s10586-024-04363-0

  • DOI: https://doi.org/10.1007/s10586-024-04363-0
