Abstract
Building an efficient malware detection system requires attention to two properties: detection accuracy and detection time. Finding an appropriate balance between the two remains a challenging problem. In this paper, we present a real-time PE (Portable Executable) malware detection system based on analysis of the information stored in the PE Optional Header fields (PEF). Our system uses a combination of the Chi-square (KHI2) score and the Phi (ϕ) coefficient as its feature selection method. We evaluated the system using the Rotation Forest classifier implemented in WEKA and reached an accuracy of more than 97%. The system can categorize a file in 0.077 seconds, which makes it suitable for real-time malware detection.
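The abstract does not spell out how the two statistics are computed or combined, so the following is only a minimal sketch of the standard Chi-square score and Phi coefficient for a binary feature on a 2x2 contingency table (feature present or absent versus malware or benign). The function names, the field names, and the counts are illustrative assumptions, not the authors' implementation or data.

```python
import math


def chi2_and_phi(a, b, c, d):
    """Return (chi-square, phi) for a 2x2 contingency table.

    Assumed layout (hypothetical, not taken from the paper):
      a = malware files with the feature,    b = benign files with the feature
      c = malware files without the feature, d = benign files without the feature
    """
    n = a + b + c + d
    # Product of the four marginal totals.
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    if denom == 0:
        # A zero marginal means the feature carries no usable signal.
        return 0.0, 0.0
    phi = (a * d - b * c) / math.sqrt(denom)
    # For a 2x2 table, chi-square equals n * phi^2.
    chi2 = n * phi * phi
    return chi2, phi


def rank_features(counts):
    """Sort feature names by descending chi-square score."""
    scored = {name: chi2_and_phi(*cells) for name, cells in counts.items()}
    return sorted(scored, key=lambda name: scored[name][0], reverse=True)


# Made-up counts for three PE Optional Header fields, for illustration only.
example_counts = {
    "DllCharacteristics": (10, 0, 0, 10),
    "MajorImageVersion": (8, 2, 3, 7),
    "SizeOfHeaders": (5, 5, 5, 5),
}
ranking = rank_features(example_counts)
```

In this sketch the sign of Phi indicates whether a feature is associated with the malware class or the benign class, while Chi-square measures only the strength of the association; how the paper actually combines the two is not stated in the abstract.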
© 2015 IFIP International Federation for Information Processing
Cite this paper
Belaoued, M., Mazouzi, S. (2015). A Real-Time PE-Malware Detection System Based on CHI-Square Test and PE-File Features. In: Amine, A., Bellatreche, L., Elberrichi, Z., Neuhold, E., Wrembel, R. (eds) Computer Science and Its Applications. CIIA 2015. IFIP Advances in Information and Communication Technology, vol 456. Springer, Cham. https://doi.org/10.1007/978-3-319-19578-0_34
DOI: https://doi.org/10.1007/978-3-319-19578-0_34
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-19577-3
Online ISBN: 978-3-319-19578-0
eBook Packages: Computer Science, Computer Science (R0)