Music classification as a new approach for malware detection

  • Original Paper
Journal of Computer Virology and Hacking Techniques

Abstract

Each year, a huge number of malicious programs are released, making malware detection a critical task in computer security. Antivirus products use various methods for detecting malware, such as signature-based and heuristic-based techniques. Polymorphic and metamorphic malware employ obfuscation techniques to bypass these traditional detection methods, and the number of such malware samples has recently increased dramatically. Most previously proposed detection methods are based on high-level features such as opcodes, function calls or the program’s control flow graph (CFG). Because of new obfuscation techniques, extracting high-level features is difficult, error-prone and time-consuming; hence approaches that use a program’s raw bytes are quicker and more accurate. In this paper, a novel byte-level method for detecting malware with audio signal processing techniques is presented. In the proposed method, a program’s bytes are converted to a meaningful audio signal, and Music Information Retrieval (MIR) techniques are then employed to build a machine learning music classification model from the audio signals that detects new and unseen instances. Experiments evaluate the influence of different strategies for converting bytes to audio signals and the effectiveness of the method.
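
The byte-to-audio idea described in the abstract can be illustrated with a minimal sketch. The abstract does not specify the conversion strategy or the MIR features used, so everything here is an assumption made for illustration: each byte is folded into a MIDI note number (`b % 128`), each note is rendered as a short sine tone, and two simple signal descriptors (zero-crossing rate and mean energy) stand in for the richer features a real MIR pipeline would extract. The names `byte_to_frequency`, `bytes_to_signal` and `extract_features` are hypothetical and are not taken from the paper.

```python
import math

SAMPLE_RATE = 44100  # Hz; an assumed value, not taken from the paper


def byte_to_frequency(b):
    """Map a byte (0-255) to a pitch in Hz.

    Assumption: the byte is folded into the MIDI note range 0-127 and then
    converted with the standard MIDI-to-frequency formula (note 69 = 440 Hz).
    """
    note = b % 128
    return 440.0 * 2.0 ** ((note - 69) / 12.0)


def bytes_to_signal(data, samples_per_note=441):
    """Render each byte as a short sine tone and concatenate the tones."""
    signal = []
    for b in data:
        freq = byte_to_frequency(b)
        for n in range(samples_per_note):
            signal.append(math.sin(2.0 * math.pi * freq * n / SAMPLE_RATE))
    return signal


def extract_features(signal):
    """Return [zero-crossing rate, mean energy] as a toy feature vector."""
    if len(signal) < 2:
        return [0.0, 0.0]
    # A crossing is counted whenever two neighbouring samples differ in sign.
    crossings = sum(
        1 for a, b in zip(signal, signal[1:]) if (a < 0) != (b < 0)
    )
    zcr = crossings / (len(signal) - 1)
    energy = sum(s * s for s in signal) / len(signal)
    return [zcr, energy]


if __name__ == "__main__":
    # "MZ" is the magic number at the start of Windows PE files.
    demo = bytes_to_signal(b"MZ\x90\x00", samples_per_note=100)
    print(len(demo), extract_features(demo))
```

In a full pipeline, feature vectors of this kind, computed for labelled benign and malicious programs, would be passed to a standard classifier; which features and which classifier perform best is what the paper's experiments evaluate.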



Author information

Corresponding author

Correspondence to Ali Hamzeh.

About this article

Cite this article

Farrokhmanesh, M., Hamzeh, A. Music classification as a new approach for malware detection. J Comput Virol Hack Tech 15, 77–96 (2019). https://doi.org/10.1007/s11416-018-0321-2

  • DOI: https://doi.org/10.1007/s11416-018-0321-2
