Malware detection using assembly and API call sequences
Malicious code is one of the major problems in information assurance. To evade detection, malware is often encrypted or obfuscated to produce variants that continue to plague properly defended and patched networks with zero-day exploits. Because malware authors use obfuscation techniques to automatically generate polymorphic and metamorphic versions, anti-virus vendors must continually collect new samples and create signatures that recognize the new variants. Creating a signature for each variant in a timely fashion is a persistent problem for anti-virus companies. In this paper we present detection algorithms that can help the anti-virus community ensure that a variant of a known malware can still be detected without creating a new signature: a similarity analysis (based on specific quantitative measures) produces a matrix of similarity scores that can be used to determine the likelihood that a piece of code under inspection contains a particular malware. The two general malware detection methods presented in this paper are Static Analyzer for Vicious Executables (SAVE) and Malware Examiner using Disassembled Code (MEDiC). MEDiC uses assembly calls for analysis, and SAVE uses API calls (static API call sequence and static API call set). We show where assembly can be superior to API calls, in that it allows a more detailed comparison of executables. API calls, on the other hand, can be superior to assembly in speed and in signature size. We present experimental results indicating that both proposed techniques can provide better detection performance against obfuscated malware. We also found a few false positives when the threshold is set to a 50% similarity measure, such as programs that use network functions (e.g. PuTTY) and encrypted programs (no API calls or assembly functions are found in the source code). However, these false positives can be minimized, for example by raising the threshold that determines whether a program falls in the malicious category to 70%.
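As a rough illustration of the similarity analysis and threshold described above, the following minimal Python sketch scores a program's static API calls against known signatures. It is not the SAVE or MEDiC implementation. The abstract does not name the quantitative measures, so the sketch assumes a Jaccard index for the API call set and a `difflib` matching ratio for the API call sequence. The function names, the signature name, the API call lists, and the rule of flagging a program when either score reaches the threshold are all hypothetical.

```python
from difflib import SequenceMatcher


def set_similarity(calls_a, calls_b):
    """Jaccard index of two API call sets (assumed measure, not from the paper)."""
    a, b = set(calls_a), set(calls_b)
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def sequence_similarity(calls_a, calls_b):
    """Order-aware similarity of two API call sequences (assumed measure)."""
    return SequenceMatcher(None, calls_a, calls_b, autojunk=False).ratio()


def similarity_matrix(sample, signatures):
    """Score one sample against every known signature."""
    return {
        name: {
            "sequence": sequence_similarity(sample, sig),
            "set": set_similarity(sample, sig),
        }
        for name, sig in signatures.items()
    }


def flag(scores, threshold):
    """Names of signatures whose best score reaches the threshold (assumed rule)."""
    return [name for name, s in scores.items() if max(s.values()) >= threshold]


# Hypothetical data: one malware signature and a benign network program.
SIGNATURES = {
    "example_worm": ["socket", "connect", "send", "recv",
                     "CreateFileA", "WriteFile", "RegSetValueExA"],
}
NETWORK_TOOL = ["socket", "connect", "send", "recv", "closesocket"]

scores = similarity_matrix(NETWORK_TOOL, SIGNATURES)
print(scores)
print("flagged at 50%:", flag(scores, 0.5))
print("flagged at 70%:", flag(scores, 0.7))
```

With this made-up data, the benign network program shares enough network calls with the signature to cross a 50% threshold but not a 70% one, which mirrors the kind of false positive the abstract reports for programs that use network functions.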
Keywords: Malicious Code Obfuscation Technique Mexico Tech Virus Threshold Disassemble Code