Machine Learning for Static Malware Analysis

  • Living reference work entry
Encyclopedia of Machine Learning and Data Science

Abstract

Malicious software (malware) is any program or code designed to harm systems or steal information from them. It includes viruses, worms, and Trojan horses, among other types. Malware poses a serious threat to everyone connected to the cyberworld. Malware analysis has therefore been researched extensively as the variety and volume of malware have increased dramatically. Until recently, signature-based detection was the prevailing approach. However, it is becoming less effective because it can only recognize malware that has been seen before. To counter new types of malware, machine learning-based detection and analysis techniques have been developed at a growing pace, driven by their effectiveness, speed, safety, and depth of investigation of malware samples. Static malware analysis examines the content of an executable without running it. It does so by extracting features statically, such as API calls, binary sequences, and control flow graphs (CFGs). This area of research is still developing, however, because packed files and other obfuscation techniques used to evade analysis remain a challenge for purely static methods.
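
As an illustration of obtaining binary-sequence features statically, the following is a minimal sketch that counts overlapping byte n-grams in raw bytes and turns them into relative frequencies. It is not the method of this entry or of any cited work. The function names, the toy byte string, and the tiny vocabulary are all hypothetical. In practice the vocabulary would likely be selected from a labeled corpus and the resulting vector passed to a classifier.

```python
from collections import Counter
from typing import List, Sequence


def byte_ngrams(data: bytes, n: int = 2) -> Counter:
    """Count overlapping byte n-grams in a raw byte string.

    The bytes are only read, never executed, so this is a purely
    static feature.
    """
    if n <= 0:
        raise ValueError("n must be positive")
    return Counter(data[i:i + n] for i in range(len(data) - n + 1))


def ngram_feature_vector(data: bytes,
                         vocabulary: Sequence[bytes],
                         n: int = 2) -> List[float]:
    """Relative frequency of each vocabulary n-gram within data.

    `vocabulary` is a hypothetical, fixed list of n-grams; every entry
    is assumed to have length n.
    """
    counts = byte_ngrams(data, n)
    total = sum(counts.values())
    if total == 0:
        # Input shorter than n bytes: no n-grams to count.
        return [0.0] * len(vocabulary)
    return [counts[gram] / total for gram in vocabulary]


# Toy input (assumption): eight bytes standing in for file content.
sample = b"MZ\x90\x00MZ\x90\x00"
vocab = [b"MZ", b"\x90\x00", b"\xff\xff"]
features = ngram_feature_vector(sample, vocab)
```

To apply the same functions to a file, its content could be read with `open(path, "rb").read()`. A packed or encrypted executable would yield n-grams of the packed bytes rather than of the original code, which is one reason such files remain difficult for purely static features.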

Author information

Corresponding author

Correspondence to Steven H. H. Ding.

Copyright information

© 2022 Springer Science+Business Media, LLC, part of Springer Nature

About this entry

Cite this entry

Mansour, Z., Molloy, C., Ding, S.H.H. (2022). Machine Learning for Static Malware Analysis. In: Phung, D., Webb, G.I., Sammut, C. (eds) Encyclopedia of Machine Learning and Data Science. Springer, New York, NY. https://doi.org/10.1007/978-1-4899-7502-7_981-1

  • DOI: https://doi.org/10.1007/978-1-4899-7502-7_981-1

  • Publisher Name: Springer, New York, NY

  • Print ISBN: 978-1-4899-7502-7

  • Online ISBN: 978-1-4899-7502-7

  • eBook Packages: Springer Reference Computer Sciences; Reference Module Computer Science and Engineering
