Construction and evaluation of the new heuristic malware detection mechanism based on executable files static analysis

  • A. V. Kozachok
  • V. I. Kozachok
Original Paper


The paper justifies the use of a new set of features, collected during static analysis of executable files, for detecting malicious code. The study addressed the following problems: developing an executable file classifier that requires no a priori data about file functionality; building class models of uninfected files and of malware during the learning process; and developing a malicious code detection procedure that applies neural networks and a composition (ensemble) of decision trees to the feature set obtained from static analysis of executable files. The paper reports an experimental evaluation of the detection mechanism built on neural networks (accuracy 0.99125) and on a decision tree composition (accuracy 0.99240). The results confirm the hypothesis that a heuristic malware analyzer can be constructed from features selected during static analysis of executable files.
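As a rough illustration of the two ideas the abstract relies on, the sketch below computes a few features from a file's raw bytes without executing it, and scores predictions with the accuracy metric. The feature names (`size`, `entropy`, `has_mz_header`) and the helper functions are assumptions made for illustration only; the abstract does not list the authors' actual feature set or implementation.

```python
import math
from collections import Counter


def byte_entropy(data: bytes) -> float:
    """Shannon entropy of the byte distribution, in bits per byte (0.0 to 8.0)."""
    if not data:
        return 0.0
    n = len(data)
    counts = Counter(data)
    return -sum(c / n * math.log2(c / n) for c in counts.values())


def static_features(data: bytes) -> dict:
    """Hypothetical static features; NOT the feature set used in the paper."""
    return {
        "size": len(data),                  # file size in bytes
        "entropy": byte_entropy(data),      # high values often indicate packing
        "has_mz_header": data[:2] == b"MZ", # DOS/PE executables start with "MZ"
    }


def accuracy(predicted, actual) -> float:
    """Fraction of samples whose predicted label equals the true label."""
    if len(predicted) != len(actual) or not actual:
        raise ValueError("label sequences must be non-empty and of equal length")
    return sum(p == a for p, a in zip(predicted, actual)) / len(actual)
```

In a full pipeline, vectors like the one returned by `static_features` would be passed to a trained classifier (a neural network or a tree ensemble), and `accuracy` would be computed on a held-out set of labeled files.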


Anti-virus protection, Malware, Neural networks, Decision trees, Heuristic analysis, Machine learning



Copyright information

© Springer-Verlag France SAS 2017

Authors and Affiliations

  1. Academy of the Federal Guard Service, Oryol, Russia
