Windows malware detection system based on LSVC recommended hybrid features

  • S. L. Shiva DarshanEmail author
  • C. D. Jaidhar
Original Paper


To combat exponentially evolved modern malware, an effective Malware Detection System and precise malware classification is highly essential. In this paper, the Linear Support Vector Classification (LSVC) recommended Hybrid Features based Malware Detection System (HF-MDS) has been proposed. It uses a combination of the static and dynamic features of the Portable Executable (PE) files as hybrid features to identify unknown malware. The application program interface calls invoked by the PE files during their execution along with their correspondent category are collected and considered as dynamic features from the PE file behavioural report produced by the Cuckoo Sandbox. The PE files’ header details such as optional header, disk operating system header, and file header are treated as static features. The LSVC is used as a feature selector to choose prominent static and dynamic features from their respective Original Feature Space. The features recommended by the LSVC are highly discriminative and used as final features for the classification process. Different sets of experiments were conducted using real-world malware samples to verify the combination of static and dynamic features, which encourage the classifier to attain high accuracy. The tenfold cross-validation experimental results demonstrate that the proposed HF-MDS is proficient in precisely detecting malware and benign PE files by attaining detection accuracy of 99.743% with sequential minimal optimization classifier consisting of hybrid features.


Cuckoo Sandbox Feature selector Malware N-grams Portable executable files 


  1. 1.
    Alam, S., Traore, I., Sogukpinar, I.: Annotated control flow graph for metamorphic malware detection. Comput. J. 58(10), 2608–2621 (2014)CrossRefGoogle Scholar
  2. 2.
    Allix, K., Bissyandé, T.F., Jérome, Q., Klein, J., Le Traon, Y.: Empirical assessment of machine learning-based malware detectors for android. Empir. Softw. Eng. 21(1), 183–211 (2016)CrossRefGoogle Scholar
  3. 3.
    Amin, M.: A survey of financial losses due to malware. In: Proceedings of the Second International Conference on Information and Communication Technology for Competitive Strategies, p. 145. ACM (2016)Google Scholar
  4. 4.
    Awan, S., Saqib, N.A.: Detection of malicious executables using static and dynamic features of portable executable (pe) file. In: Security, Privacy and Anonymity in Computation, Communication and Storage: SpaCCS 2016 International Workshops, TrustData, TSP, NOPE, DependSys, BigDataSPT, and WCSSC, Zhangjiajie, China, November 16–18, 2016, Proceedings 9, pp. 48–58. Springer, New York (2016)Google Scholar
  5. 5.
    Bai, J., Wang, J., Zou, G.: A malware detection scheme based on mining format information. Sci. World J. 2014 (2014)Google Scholar
  6. 6.
    Bayer, U., Moser, A., Kruegel, C., Kirda, E.: Dynamic analysis of malicious code. J. Comput. Virol. 2(1), 67–77 (2006)CrossRefGoogle Scholar
  7. 7.
    Bazrafshan, Z., Hashemi, H., Fard, S.M.H., Hamzeh, A.: A survey on heuristic malware detection techniques. In: Information and Knowledge Technology (IKT), 2013 5th Conference on, pp. 113–120. IEEE (2013)Google Scholar
  8. 8.
    Belaoued, M., Mazouzi, S.: A real-time pe-malware detection system based on chi-square test and pe-file features. In: IFIP International Conference on Computer Science and its Applications __, pp. 416–425. Springer, New York (2015)Google Scholar
  9. 9.
    Bounouh, T., Brahimi, Z., Al-Nemrat, A., Benzaid, C.: A scalable malware classification based on integrated static and dynamic features. In: International Conference on Global Security, Safety, and Sustainability, pp. 113–124. Springer, New York (2017)Google Scholar
  10. 10.
    Calleja, A., Tapiador, J., Caballero, J.: A look into 30 years of malware development from a software metrics perspective. In: International Symposium on Research in Attacks, Intrusions, and Defenses, pp. 325–345. Springer, New York (2016)Google Scholar
  11. 11.
    Das, S., Liu, Y., Zhang, W., Chandramohan, M.: Semantics-based online malware detection: towards efficient real-time protection against malware. IEEE Trans. Inf. Forensics Secur. 11(2), 289–302 (2016)CrossRefGoogle Scholar
  12. 12.
    Firdausi, I., Erwin, A., Nugroho, A.S., et al.: Analysis of machine learning techniques used in behavior-based malware detection. In: Advances in Computing, Control and Telecommunication Technologies (ACT), 2010 Second International Conference on, pp. 201–203. IEEE (2010)Google Scholar
  13. 13.
    Gandotra, E., Bansal, D., Sofat, S.: Integrated framework for classification of malwares. In: Proceedings of the 7th International Conference on Security of Information and Networks, p. 417. ACM (2014)Google Scholar
  14. 14.
    Guarnieri, C., Tanasi, A., Bremer, J., Schloesser, M.: Automated malware analysis-cuckoo sandbox (2012)Google Scholar
  15. 15.
    Islam, R., Tian, R., Batten, L.M., Versteeg, S.: Classification of malware based on integrated static and dynamic features. J. Netw. Comput. Appl. 36(2), 646–656 (2013)CrossRefGoogle Scholar
  16. 16.
    Kawaguchi, N., Omote, K.: Malware function classification using APIs in initial behavior. In: Information Security (AsiaJCIS), 2015 10th Asia Joint Conference on, pp. 138–144. IEEE (2015)Google Scholar
  17. 17.
    Kolter, J.Z., Maloof, M.A.: Learning to detect and classify malicious executables in the wild. J. Mach. Learn. Res. 7, 2721–2744 (2006)MathSciNetzbMATHGoogle Scholar
  18. 18.
    Kumar, A., Kuppusamy, K., Aghila, G.: A learning model to detect maliciousness of portable executable using integrated feature set. J. King Saud Univ.-Comput. Inf. Sci. (2017)Google Scholar
  19. 19.
    Lengyel, T.K., Maresca, S., Payne, B.D., Webster, G.D., Vogl, S., Kiayias, A.: Scalability, fidelity and stealth in the drakvuf dynamic malware analysis system. In: Proceedings of the 30th Annual Computer Security Applications Conference, pp. 386–395. ACM (2014)Google Scholar
  20. 20.
    Masud, M.M., Khan, L., Thuraisingham, B.: A scalable multi-level feature extraction technique to detect malicious executables. Inf. Syst. Front. 10(1), 33–45 (2008)CrossRefGoogle Scholar
  21. 21.
    Miller, C., Glendowne, D., Cook, H., Thomas, D., Lanclos, C., Pape, P.: Insights gained from constructing a large scale dynamic analysis platform. Digit. Investig. 22, S48–S56 (2017)CrossRefGoogle Scholar
  22. 22.
    Mohaisen, A., Alrawi, O., Mohaisen, M.: Amal: high-fidelity, behavior-based automated malware analysis and classification. Comput. Secur. 52, 251–266 (2015)CrossRefGoogle Scholar
  23. 23.
    Moser, A., Kruegel, C., Kirda, E.: Limits of static analysis for malware detection. In: Computer Security Applications Conference, 2007. ACSAC 2007. Twenty-Third Annual, pp. 421–430. IEEE (2007)Google Scholar
  24. 24.
    Noor, M., Abbas, H., Shahid, W.B.: Countering cyber threats for industrial applications: an automated approach for malware evasion detection and analysis. J. Netw. Comput. Appl. (2017)Google Scholar
  25. 25.
    Pekalska, E., Duin, R.P.: Classifiers for dissimilarity-based pattern recognition. In: Pattern Recognition, 2000. Proceedings. 15th International Conference on, vol. 2, pp. 12–16. IEEE (2000)Google Scholar
  26. 26.
    Pektaş, A., Eriş, M., Acarman, T.: Proposal of n-gram based algorithm for malware classification. In: The Fifth International Conference on Emerging Security Information, Systems and Technologies, pp. 7–13 (2011)Google Scholar
  27. 27.
    Qiao, Y., Yang, Y., He, J., Tang, C., Liu, Z.: CBM: free, automatic malware analysis framework using API call sequences. In: Knowledge Engineering and Management, pp. 225–236. Springer, New York (2014)Google Scholar
  28. 28.
    Reddy, D.K.S., Pujari, A.K.: N-gram analysis for computer virus detection. J. Comput. Virol. 2(3), 231–239 (2006)CrossRefGoogle Scholar
  29. 29.
    Rieck, K., Trinius, P., Willems, C., Holz, T.: Automatic analysis of malware behavior using machine learning. J. Comput. Secur. 19(4), 639–668 (2011)CrossRefGoogle Scholar
  30. 30.
    Saleh, M., Li, T., Xu, S.: Multi-context features for detecting malicious programs. J. Comput. Virol. Hacking Tech., pp. 1–13 (2017)Google Scholar
  31. 31.
    Salehi, Z., Sami, A., Ghiasi, M.: Maar: Robust features to detect malicious activity based on api calls, their arguments and return values. Eng. Appl. Artif. Intell. 59, 93–102 (2017)CrossRefGoogle Scholar
  32. 32.
    Santos, I., Brezo, F., Nieves, J., Penya, Y.K., Sanz, B., Laorden, C., Bringas, P.G.: Idea: Opcode-sequence-based malware detection. In: International Symposium on Engineering Secure Software and Systems, pp. 35–43. Springer, New York (2010)Google Scholar
  33. 33.
    Santos, I., Brezo, F., Ugarte-Pedrero, X., Bringas, P.G.: Opcode sequences as representation of executables for data-mining-based unknown malware detection. Inf. Sci. 231, 64–82 (2013)MathSciNetCrossRefGoogle Scholar
  34. 34.
    Santos, I., Devesa, J., Brezo, F., Nieves, J., Bringas, P.G.: Opem: a static-dynamic approach for machine-learning-based malware detection. In: International Joint Conference CISIS12-ICEUTE 12-SOCO 12 Special Sessions, pp. 271–280. Springer, New York (2013)Google Scholar
  35. 35.
    Santos, I., Nieves, J., Bringas, P.G.: Semi-supervised learning for unknown malware detection. In: DCAI, pp. 415–422. Springer, New York (2011)Google Scholar
  36. 36.
    Schultz, M.G., Eskin, E., Zadok, F., Stolfo, S.J.: Data mining methods for detection of new malicious executables. In: Security and Privacy, 2001. S&P 2001. Proceedings. 2001 IEEE Symposium on, pp. 38–49. IEEE (2001)Google Scholar
  37. 37.
    Shabtai, A., Moskovitch, R., Elovici, Y., Glezer, C.: Detection of malicious code by applying machine learning classifiers on static features: a state-of-the-art survey. Inf. Secur. Tech. Rep. 14(1), 16–29 (2009)CrossRefGoogle Scholar
  38. 38.
    Shahzad, R.K., Haider, S.I., Lavesson, N.: Detection of spyware by mining executable files. In: Availability, Reliability, and Security, 2010. ARES’10 International Conference on, pp. 295–302. IEEE (2010)Google Scholar
  39. 39.
    Sharma, A., Sahay, S.K.: Evolution and detection of polymorphic and metamorphic malwares: a survey. arXiv preprint arXiv:1406.7061 (2014)
  40. 40.
    Shijo, P., Salim, A.: Integrated static and dynamic analysis for malware detection. Procedia Comput. Sci. 46, 804–811 (2015)CrossRefGoogle Scholar
  41. 41.
    Touchette, F.: The evolution of malware. Netw. Secur. 2016(1), 11–14 (2016)CrossRefGoogle Scholar
  42. 42.
    Vinod, P., Laxmi, V., Gaur, M.S.: Scattered feature space for malware analysis. In: Advances in Computing and Communications, pp. 562–571 (2011)Google Scholar
  43. 43.
    Willems, C., Holz, T., Freiling, F.: Toward automated dynamic malware analysis using cwsandbox. IEEE Security and Privacy 5(2) (2007)Google Scholar
  44. 44.
    Ye, Y., Chen, L., Wang, D., Li, T., Jiang, Q., Zhao, M.: SBMDS: an interpretable string based malware detection system using svm ensemble with bagging. J. Comput. Virol. 5(4), 283–293 (2009)CrossRefGoogle Scholar
  45. 45.
    Ye, Y., Li, T., Adjeroh, D., Iyengar, S.S.: A survey on malware detection using data mining techniques. ACM Comput. Surv. 50(3), 41 (2017)CrossRefGoogle Scholar
  46. 46.
    Ye, Y., Li, T., Chen, Y., Jiang, Q.: Automatic malware categorization using cluster ensemble. In: Proceedings of the 16th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp. 95–104. ACM (2010)Google Scholar

Copyright information

© Springer-Verlag France SAS, part of Springer Nature 2018

Authors and Affiliations

  1. 1.Department of Information TechnologyNational Institute of Technology KarnatakaSurathkalIndia

Personalised recommendations