Advances in Systems Science pp 355-364
A Real-Time Approach for Detecting Malicious Executables
Abstract
In this paper, we develop a real-time algorithm to detect malicious portable executable (PE) files. The proposed algorithm consists of feature extraction, vector quantization, and a classifier named the Attribute-Biased Classifier (ABC). We collected a large data set of malicious PE files from the Honeynet project at EG-CERT and from VirusSign to train and test the proposed system. We first apply a feature extraction algorithm to remove redundant features. The most effective features are then mapped into two vector quantizers. Finally, the outputs of the two quantizers are passed to the proposed ABC classifier to identify a PE file. The results show that our algorithm detects malicious PE files with a 99.3% detection rate, 97% accuracy, 0.998 AUC, and a false positive rate below 1%. In addition, our algorithm takes a fraction of a second to test a PE file.
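The vector quantization step mentioned above can be illustrated with a minimal sketch. It assumes a quantizer that maps a feature vector to the index of its nearest codeword under Euclidean distance. The function name `quantize`, the codebook values, and the two-dimensional features are hypothetical and are not taken from the paper, which does not specify its codebook design or the internals of the ABC classifier in this abstract.

```python
import math

def quantize(vector, codebook):
    """Return the index of the codeword nearest to `vector` (Euclidean distance)."""
    best_index, best_dist = -1, math.inf
    for index, codeword in enumerate(codebook):
        dist = math.dist(vector, codeword)  # Euclidean distance (Python 3.8+)
        if dist < best_dist:
            best_index, best_dist = index, dist
    return best_index

# Hypothetical codebook of three 2-D codewords (illustrative values only).
codebook = [(0.0, 0.0), (1.0, 1.0), (4.0, 0.5)]

# Hypothetical feature vector extracted from a PE file.
features = (0.9, 1.2)
index = quantize(features, codebook)
```

In a pipeline of the kind described, an index like this (one per quantizer) would serve as the compact input to the downstream classifier. How the two quantizers are trained and how their outputs are combined is left open here.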
Keywords
Portable executables, malicious detection, data mining, vector quantization