Abstract
In this work we present a graph-based approach for behavior-based malware detection and classification utilizing the Group Relation Graphs (GrG), resulting after the grouping of disjoint vertices of System-call Dependency Graphs obtained through the dynamic taint analysis over after the execution of a program. Throughout this approach we utilize the sequence on the appearance of each edge in the GrG graph in order to depict the information regarding the sequential dependencies between the System-calls groups invoked during the execution of a program, proposing the so-called Group Sequence Graphs (GsG). Utilizing the proposed approach, we investigate further valuable structural characteristics of the graphs augmenting the GrG with further information that increase their potentials against the representation of mutated malware samples. We develop an integrated behavior-based malware detection and classification system that incorporates the proposed approach, utilizing different types of structural characteristics of GsG graphs, namely, the Relational, the Quantitative and the Qualitative characteristics, evaluating its potentials on distinguishing malicious from benign samples and indexing the malicious ones into known malware families, proving it potentials against a set of malicious samples from a wide variety of known malware families.
Similar content being viewed by others
References
Babic, D., Reynaud, D., Song, D.: Malware analysis with tree automata inference. In: Proceedings of the 23rd International Conference on Computer Aided Verification (CAV’11), pp. 116–131 (2011)
Bastian, M., Heymann, S., Jacomy, M.: Gephi: an open source software for exploring and manipulating networks. In: Third International AAAI Conference on weblogs and Social Media (2009)
Canzanese, R., Kam, M., Mancoridis, S.: Toward an automatic, online behavioral malware classification system. In: 2013 IEEE 7th International Conference on Self-Adaptive and Self-Organizing Systems, pp. 111–120. IEEE (2013)
Chaumette, S., Ly, O., Tabary, R.: Automated extraction of polymorphic virus signatures using abstract interpretation. In: 2011 5th International Conference on IEEE Network and System Security (NSS) (2011)
Chysi, A., Nikolopoulos, S.D., Polenakis, I.: An algorithmic framework for malicious software detection exploring structural characteristics of behavioral graphs. In: Proceedings of the 21st International Conference on Computer Systems and Technologies’ 20, pp. 43–50
Christodorescu, M., Jha, S., Seshia, A., Song, D., Bryant, R.E.: Semantics-aware malware detection. In: 2005 IEEE Symposium on Security and Privacy (S &P’05) (2005)
Christodorescu, M., Jha, S., Kruegel, C.: Mining specifications of malicious behavior. In: Proceedings of the 6th Joint Meeting of the European Software Engineering Conference and the ACM SIGSOFT Symposium on the Foundations of Software Engineering (2007)
Ding, Y., Xia, X., Chen, S., Li, Y.: A Malware detection method based on family behavior graph. Comput. Secur. 73, 73–86 (2018)
Eskandari, R., Shajari, M., Ghahfarokhi, M.M.: ERES: an extended regular expression signature for polymorphic worm detection. J. Comput. Virol. Hack. Tech. 15(3), 177–194 (2019)
Fredrikson, M., Jha, S., Christodorescu, M., Sailer, R., Yan, X.: Synthesizing near-optimal malware specifications from suspicious behaviors. In: 2010 IEEE Symposium on IEEE Security and Privacy (SP), pp. 45–60 (2010)
Garg, V., Yadav, R.K.: Malware detection based on API calls frequency. In: 2019 4th International Conference on Information Systems and Computer Networks (ISCON), pp. 400–404. IEEE (2019)
Hashemi, H., Azmoodeh, A., Hamzeh, A., Hashemi, S.: Graph embedding as a new approach for unknown malware detection. J. Comput. Virol. Hack. Tech. 13(3), 153–166 (2017)
Hashemi, H., Hamzeh, A.: Visual malware detection using local malicious pattern. J. Comput. Virol. Hack. Tech. 15(1), 1–14 (2019)
Hassen, M., Chan, P.K.: Scalable function call graph-based malware classification. In: Proceedings of the Seventh ACM on Conference on Data and Application Security and Privacy, pp. 239–248. ACM (2017)
Hu, X., Chiueh, T., Shin, K.G.: Large-scale malware indexing using function-call graphs. In: Proceedings of the 16th ACM Conference on Computer and Communications Security (CCS’09), pp. 611–620 (2009)
John, T.S., Thomas, T., Emmanuel, S.: Graph convolutional networks for android malware detection with system call graphs. In: ISEA Conference on Security and Privacy (ISEA-ISAP), pp. 162–170. IEEE (2020)
Karim, M.E., Walenstein, A., Lakhotia, A., Parida, L.: Malware phylogeny generation using permutations of code. J. Comput. Virol. 1(1–2), 13–23 (2005)
Kim, H., Kim, J., Kim, Y., Kim, I., Kim, K.J., Kim, H.: Improvement of malware detection and classification using API call sequence alignment and visualization. Clust. Comput. 22(1), 921–929 (2019)
Kozachok, A.V., Kozachok, V.I.: Construction and evaluation of the new heuristic malware detection mechanism based on executable files static analysis. J. Comput. Virol. Hack. Tech. 14(3), 225–231 (2018)
Mathur, K., Hiranwal, S.: A survey on techniques in detection and analyzing malware executables. J. Adv. Res. Comput. Sci. Softw. Eng. 3, 22–428 (2013)
Makandar, A., Patrot, A.: Trojan malware image pattern classification. In: Proceedings of International Conference on Cognition and Recognition, pp. 253–262. Springer, Singapore (2018)
Ming, J., Xu, D., Wu, D.: MalwareHunt: semantics-based malware diffing speedup by normalized basic block memoization. J. Comput. Virol. Hack. Tech. 13(3), 167–178 (2017)
Mohaisen, A., West, A.G., Mankin, A., Alrawi, O.: Chatter: classifying malware families using system event ordering. In: 2014 IEEE Conference on Communications and Network Security, pp. 283–291. IEEE (2014)
Mukesh, S.D., Raval, J.A., Upadhyay, H.: Real-time framework for malware detection using machine learning technique. In: International Conference on Information and Communication Technology for Intelligent Systems, pp. 173–182. Springer, Cham (2017)
Newsome, J., Song, D.: Dynamic taint analysis for automatic detection, analysis, and signature generation of exploits on commodity software. In: Proceedings of the 12th Annual Network and Distributed System Security Symposium (NDSS05) (2005)
Nikolopoulos, S.D., Polenakis, I.: A graph-based model for malicious code detection exploiting dependencies of system-call groups. In: Proceedings of the 16th International Conference on Computer Systems and Technologies, pp. 228–235 (2015)
Nikolopoulos, S.D., Polenakis, I.: A graph-based model for malware detection and classification using system-call groups. J. Comput. Virol. Hack. Tech. 13(1), 29–46 (2017)
Mpanti, A., Nikolopoulos, S.D., Polenakis, I.: A graph-based model for malicious software detection exploiting domination relations between system-call groups. In: Proceedings of the 19th International Conference on Computer Systems and Technologies, pp. 20–26 (2018)
Rezaei, T., Hamze, A.: An efficient approach for malware detection using PE header specifications. In: 2020 6th International Conference on Web Research (ICWR), pp. 234–239. IEEE (2020)
Sami, A., Yadegari, B., Rahimi, H., Peiravian, N., Hashemi, S., Hamze, A.: Malware detection based on mining API calls. In: Proceedings of the 2010 ACM Symposium on Applied Computing, pp. 1020–1025 (2010)
Suaboot, J., Tari, Z., Mahmood, A., Zomaya, A., Li, W.: Sub-curve HMM: a malware detection approach based on partial analysis of API call sequences. Comput. Secur. 92, 101773 (2020)
Szor, P., Ferrie, P.: Hunting for metamorphic. In: Virus Bulletin Conference (2001)
VirusTotal. https://www.virustotal.com/gui/home/upload. Accessed Jan 2022
Walenstein, A., Lakhotia, A.: The software similarity problem in malware analysis. Internat. Begegnungs-und Forschungszentrum fur Informatik (2007)
Wüchner, T., Ochoa, M., Pretschner, A.: Robust and effective malware detection through quantitative data flow graph metrics. In: International Conference on Detection of Intrusions and Malware and Vulnerability Assessment, pp. 98–118. Springer, Cham (2015)
Wüchner, T., Ochoa, M., Pretschner, A.: Malware detection with quantitative data flow graphs. In: Proceedings of the 9th ACM Symposium on Information, Computer and Communications Security, pp. 271–282 (2014)
Xiao, F., Lin, Z., Sun, Y., Ma, Y.: Malware detection based on deep learning of behavior graphs. Math. Probl. Eng. (2019)
Xiao, F., Sun, Y., Du, D., Li, X., Luo, M.: A novel malware classification method based on crucial behaviour. Math. Probl. Eng. (2020)
Xu, M., Wu, L., Qi, S., Xu, J., Zhang, H., Ren, Y., Zheng, N.: A similarity metric method of obfuscated malware using function-call graph. J. Comput. Virol. Hack. Tech. 56, 35–47 (2013)
You, I., Yim, K.: Malware obfuscation techniques: a brief survey. In: Proceedings of the 5th International Conference on Broadband and Wireless Computing, Communication and Applications (BWCCA’10), pp. 297–300 (2010)
Zhong, Y., Yamaki, H., Takakura, H.: A malware classification method based on similarity of function structure. In: 2012 IEEE/IPSJ 12th International Symposium on Applications and the Internet, pp. 256–261. IEEE (2012)
Author information
Authors and Affiliations
Corresponding author
Ethics declarations
Conflict of interest
Not applicable.
Funding
This research is co-financed by Greece and the European Union (European Social Fund-ESF) through the Operational Programme “Human Resources Development, Education and Lifelong Learning” in the context of the project “Reinforcement of Postdoctoral Researchers—2nd Cycle” (MIS-5033021), implemented by the State Scholarships Foundation (IKY).
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
About this article
Cite this article
Nikolopoulos, S.D., Polenakis, I. Behavior-based detection and classification of malicious software utilizing structural characteristics of group sequence graphs. J Comput Virol Hack Tech 18, 383–406 (2022). https://doi.org/10.1007/s11416-022-00423-4
Received:
Accepted:
Published:
Issue Date:
DOI: https://doi.org/10.1007/s11416-022-00423-4