Abstract
Malicious code is one of the major problems in information security. To evade detection, malware (an unwanted, malicious piece of code) is packed, encrypted, and obfuscated to produce variants that continue to plague properly defended and patched systems and networks with zero-day exploits. Attackers use zero-day exploits to compromise victims' computers before the developer of the target software knows about the vulnerability.
In this chapter, we present a method for functionally classifying malicious code that might lead to automated attacks and intrusions, using computational intelligent techniques and similarity measures. We study the robustness and generalization capabilities of kernel methods for malware classification.
Results from our recent experiments indicate that similarity measures can be used to determine the likelihood that a piece of code or a binary under inspection contains a particular piece of malware. Variants within a malware family show very high similarity scores (over 85%). Interestingly, Trojans and hacking tools have high similarity scores with other Trojans and hacking tools.
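As an illustration of how a similarity score between two samples might be computed, the sketch below applies cosine similarity to API-frequency vectors in the spirit of a vector space model. The abstract does not state which similarity measure the chapter uses, so the choice of cosine similarity is an assumption, and the API traces and the helper names `api_frequency_vector` and `cosine_similarity` are hypothetical.

```python
import math
from collections import Counter


def api_frequency_vector(api_calls):
    """Map a trace of API names to a sparse frequency vector."""
    return Counter(api_calls)


def cosine_similarity(a, b):
    """Cosine of the angle between two sparse frequency vectors (0.0 to 1.0)."""
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty trace is treated as dissimilar to everything
    return dot / (norm_a * norm_b)


# Hypothetical traces: a known sample, a close variant, and an unrelated program.
known = api_frequency_vector(
    ["CreateFileA", "WriteFile", "WriteFile", "RegSetValueExA"]
)
variant = api_frequency_vector(
    ["CreateFileA", "WriteFile", "WriteFile", "RegSetValueExA", "Sleep"]
)
unrelated = api_frequency_vector(["MessageBoxA", "GetTickCount"])

print(f"known vs variant:   {cosine_similarity(known, variant):.3f}")
print(f"known vs unrelated: {cosine_similarity(known, unrelated):.3f}")
```

In this toy setting the variant differs from the known sample by a single extra call, so its score stays well above the unrelated program's; a real system would need far longer traces and a threshold chosen from data.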
Our results also show that malware analysis based on the API calling sequence and API frequency, which reflect the behavior of a particular piece of code, classifies malware with good accuracy. We also show that classification accuracy varies with the kernel type and the parameter values; thus, with appropriately chosen parameter values, support vector machines (SVMs) can detect malware with higher accuracy and lower false alarm rates.
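To make the dependence on kernel parameters concrete, the following sketch builds sequence features (API call bigrams) from two traces and evaluates a Gaussian (RBF) kernel on them for several values of its width parameter. This is not the chapter's implementation: the traces, the bigram features, the function names, and the `gamma` values are all assumptions chosen for illustration.

```python
import math
from collections import Counter


def api_ngrams(api_calls, n=2):
    """Count consecutive n-grams of API names, capturing call order."""
    return Counter(
        tuple(api_calls[i:i + n]) for i in range(len(api_calls) - n + 1)
    )


def rbf_kernel(x, y, gamma):
    """Gaussian kernel exp(-gamma * ||x - y||^2) on sparse count vectors."""
    keys = x.keys() | y.keys()
    sq_dist = sum((x.get(k, 0) - y.get(k, 0)) ** 2 for k in keys)
    return math.exp(-gamma * sq_dist)


# Hypothetical traces that differ only in the final call.
trace_a = ["OpenProcess", "VirtualAllocEx", "WriteProcessMemory", "CreateRemoteThread"]
trace_b = ["OpenProcess", "VirtualAllocEx", "WriteProcessMemory", "ResumeThread"]

features_a = api_ngrams(trace_a)
features_b = api_ngrams(trace_b)

# The same pair of samples looks near-identical or near-orthogonal to the
# kernel depending on gamma, which is why the parameter has to be tuned.
for gamma in (0.01, 0.1, 1.0, 10.0):
    print(f"gamma={gamma:<5} k(a, b)={rbf_kernel(features_a, features_b, gamma):.4f}")
```

A small `gamma` makes the kernel treat the two traces as almost the same point, while a large `gamma` separates them sharply; an SVM trained with such a kernel inherits that sensitivity, so parameter values are typically selected by cross-validation.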
Copyright information
© 2012 Springer-Verlag Berlin Heidelberg
About this chapter
Cite this chapter
Shankarpani, M.K., Kancherla, K., Movva, R., Mukkamala, S. (2012). Computational Intelligent Techniques and Similarity Measures for Malware Classification. In: Elizondo, D., Solanas, A., Martinez-Balleste, A. (eds) Computational Intelligence for Privacy and Security. Studies in Computational Intelligence, vol 394. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-25237-2_13
DOI: https://doi.org/10.1007/978-3-642-25237-2_13
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-25236-5
Online ISBN: 978-3-642-25237-2
eBook Packages: Engineering, Engineering (R0)