Transfer Learning Approach for Identification of Malicious Domain Names

  • R. Rajalakshmi
  • S. Ramraj
  • R. Ramesh Kannan
Conference paper
Part of the Communications in Computer and Information Science book series (CCIS, volume 969)

Abstract

Malware domains generated by Domain Generation Algorithms (DGAs) are highly dynamic in nature. The traditional approach of blacklisting malicious domains is time consuming and ineffective, because DGAs randomly generate the domain names used by the malware. For real-time applications, malware detection has to be performed on the fly, and hence sophisticated techniques are in demand to address this issue. Even though various machine learning techniques have been employed for this purpose, the performance of such algorithms depends on how well the features are designed. In this work, we propose a transfer learning technique that combines the best-performing Convolutional Neural Network (CNN) with machine learning algorithms such as the Naive Bayes (NB) classifier for the detection and classification of DGA-generated domains. We evaluated our approach on the dataset released by the DMD 2018 Shared Task for both the binary and the multiclass classification scenarios. Our methodology of CNN with NB for binary classification was awarded the first rank in the DMD 2018 shared task.
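
The two-stage idea described in the abstract (convolutional features over the characters of a domain name, passed to a Naive Bayes classifier) can be illustrated with a minimal, standard-library-only sketch. Everything in it is an assumption made for illustration: the alphabet, the filter count and width, the helper names `make_filters` and `conv_features`, and the toy domain strings are hypothetical and are not taken from the paper or the DMD 2018 dataset. In particular, fixed random filters stand in for the weights of a trained CNN, so the sketch shows the data flow only, not the authors' architecture or results.

```python
import math
import random

# Assumed character set for domain names (illustrative choice).
ALPHABET = "abcdefghijklmnopqrstuvwxyz0123456789-."


def make_filters(n_filters=8, width=3, seed=0):
    """Create fixed random 1-D convolution filters.

    Stand-in for the convolutional weights of a trained CNN; in a real
    transfer learning setup these would be learned, not random.
    """
    rng = random.Random(seed)
    return [
        [[rng.uniform(-1.0, 1.0) for _ in ALPHABET] for _ in range(width)]
        for _ in range(n_filters)
    ]


def conv_features(domain, filters, max_len=32):
    """Apply each filter over the one-hot characters, then ReLU and global max pooling."""
    idxs = [ALPHABET.find(ch) for ch in domain.lower()[:max_len]]
    features = []
    for filt in filters:
        width = len(filt)
        best = 0.0  # starting at 0 acts as the ReLU floor
        for start in range(max(1, len(idxs) - width + 1)):
            total = 0.0
            for k in range(width):
                pos = start + k
                if pos < len(idxs) and idxs[pos] >= 0:
                    # One-hot input: the dot product reduces to a weight lookup.
                    total += filt[k][idxs[pos]]
            best = max(best, total)
        features.append(best)
    return features


class GaussianNaiveBayes:
    """Gaussian Naive Bayes over real-valued feature vectors."""

    def fit(self, X, y):
        self.stats = {}
        for label in set(y):
            rows = [x for x, t in zip(X, y) if t == label]
            n = len(rows)
            cols = list(zip(*rows))
            means = [sum(col) / n for col in cols]
            # Small constant added to each variance to avoid division by zero.
            variances = [
                sum((v - m) ** 2 for v in col) / n + 1e-3
                for col, m in zip(cols, means)
            ]
            self.stats[label] = (math.log(n / len(y)), means, variances)
        return self

    def predict(self, X):
        predictions = []
        for x in X:
            def log_posterior(label):
                prior, means, variances = self.stats[label]
                return prior + sum(
                    -0.5 * math.log(2 * math.pi * v) - (xi - m) ** 2 / (2 * v)
                    for xi, m, v in zip(x, means, variances)
                )
            predictions.append(max(self.stats, key=log_posterior))
        return predictions


if __name__ == "__main__":
    # Toy, made-up examples: label 0 = benign-looking, 1 = DGA-looking.
    domains = ["google.com", "wikipedia.org", "xkqjzvhwpl.com", "qzxvbnmkjh.net"]
    labels = [0, 0, 1, 1]
    filters = make_filters()
    X = [conv_features(d, filters) for d in domains]
    model = GaussianNaiveBayes().fit(X, labels)
    print(model.predict([conv_features("example.com", filters)]))
```

With a trained network, `conv_features` would be replaced by the activations of an intermediate CNN layer; the Naive Bayes stage would stay the same.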

Keywords

Malware detection, DGA, CNN, Naïve Bayes classifier, Transfer learning

Notes

Acknowledgement

The authors would like to thank the management of Vellore Institute of Technology (VIT), Chennai for providing the support to carry out this research. We would also like to thank the Department of Science and Engineering Research Board (SERB), Government of India for their financial grant (Award No: ECR/2016/00484) for this research work.

Copyright information

© Springer Nature Singapore Pte Ltd. 2019

Authors and Affiliations

  1. School of Computing Science and Engineering, Vellore Institute of Technology, Chennai, India