A Real-Time Approach for Detecting Malicious Executables

Part of the Advances in Intelligent Systems and Computing book series (AISC, volume 240)

Abstract

In this paper, we develop a real-time algorithm to detect malicious portable executable (PE) files. The proposed algorithm consists of feature extraction, vector quantization, and a classifier named the Attribute-Biased Classifier (ABC). We have collected a large data set of malicious PE files from the Honeynet project at EG-CERT and from VirusSign to train and test the proposed system. We first apply a feature extraction algorithm to remove redundant features. The most effective features are then mapped into two vector quantizers. Finally, the outputs of the two quantizers are passed to the proposed ABC classifier to identify a PE file. The results show that our algorithm detects malicious PE files with a 99.3% detection rate, 97% accuracy, an AUC of 0.998, and a false positive rate below 1%. In addition, our algorithm takes a fraction of a second to test a portable executable file.
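
The two-quantizer stage described above can be illustrated with a minimal sketch. The abstract does not specify how the quantizers are trained or how the ABC classifier combines their outputs, so everything below is an assumption made for illustration: one codebook is trained per class with a simple Lloyd (k-means style) iteration, and a minimum-distortion rule stands in for ABC. The function names (`train_codebook`, `classify`) and the toy feature vectors are hypothetical and are not taken from the paper.

```python
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest(codebook, x):
    """Index of the codeword in `codebook` closest to vector `x`."""
    return min(range(len(codebook)), key=lambda i: sq_dist(codebook[i], x))


def train_codebook(vectors, k, iters=20, seed=0):
    """Build a k-entry codebook with Lloyd iterations (assumed training rule)."""
    rng = random.Random(seed)
    # Start from k distinct training vectors chosen at random.
    codebook = [list(v) for v in rng.sample(vectors, k)]
    for _ in range(iters):
        # Assign every training vector to its nearest codeword.
        cells = [[] for _ in range(k)]
        for v in vectors:
            cells[nearest(codebook, v)].append(v)
        # Move each codeword to the centroid of its cell (skip empty cells).
        for i, cell in enumerate(cells):
            if cell:
                codebook[i] = [sum(col) / len(cell) for col in zip(*cell)]
    return codebook


def distortion(codebook, x):
    """Quantization error of `x` against its nearest codeword."""
    return sq_dist(codebook[nearest(codebook, x)], x)


def classify(x, benign_cb, malicious_cb):
    """Stand-in decision rule (NOT the paper's ABC): pick the class whose
    quantizer reproduces `x` with the lower distortion."""
    if distortion(malicious_cb, x) < distortion(benign_cb, x):
        return "malicious"
    return "benign"


# Toy two-dimensional feature vectors, invented purely for the example.
benign_train = [[0.0, 0.1], [0.2, 0.0], [0.1, 0.3], [0.3, 0.2]]
malicious_train = [[5.0, 5.1], [5.2, 4.9], [4.8, 5.0], [5.1, 5.3]]

benign_cb = train_codebook(benign_train, k=2)
malicious_cb = train_codebook(malicious_train, k=2)

print(classify([4.9, 5.2], benign_cb, malicious_cb))
```

Testing a file under this sketch costs one nearest-codeword search per quantizer, which is consistent with the sub-second test time the abstract reports, although the actual system may differ in every detail.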

Keywords

Portable executables; malicious detection; data mining; vector quantization

Copyright information

© Springer International Publishing Switzerland 2014

Authors and Affiliations

  • Samir Sayed (1, 3, 4)
  • Rania R. Darwish (2)
  • Sameh A. Salem (1)
  1. Department of Electronics, Communications, and Computers Engineering, Helwan University, Cairo, Egypt
  2. Department of Mechanical Engineering, Mechatronics, Helwan University, Cairo, Egypt
  3. Department of Electronic and Electrical Engineering, University College London, London, UK
  4. Egyptian Computer Emergency Response Team (EG-CERT), National Telecom Regulatory Authority (NTRA), Cairo, Egypt
