Applying Machine Learning Techniques for Detection of Malicious Code in Network Traffic

  • Yuval Elovici
  • Asaf Shabtai
  • Robert Moskovitch
  • Gil Tahan
  • Chanan Glezer
Part of the Lecture Notes in Computer Science book series (LNCS, volume 4667)


The Early Detection, Alert and Response (eDare) system aims to remove malicious code from Web traffic passing through the premises of Network Service Providers (NSPs). To achieve this goal, the system employs powerful network traffic scanners that clean the traffic of known malicious code. The remaining traffic is monitored, and Machine Learning (ML) algorithms are invoked to pinpoint unknown malicious code that exhibits suspicious morphological patterns. Decision trees, neural networks and Bayesian networks are used for static code analysis to determine whether a suspicious executable file actually contains malicious code. These algorithms are being evaluated, and preliminary results are encouraging.
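The two-stage flow described in the abstract can be illustrated with a minimal Python sketch. It is not the eDare implementation. The signature set `KNOWN_BAD`, the byte 2-gram features, the toy training samples, and the names `ngrams`, `NaiveBayes` and `inspect_file` are all assumptions made for illustration. A naive Bayes classifier (the simplest form of Bayesian network) stands in for the decision trees, neural networks and Bayesian networks named above.

```python
import hashlib
import math
from collections import Counter

# Hypothetical signature database: SHA-256 digests of known malicious files.
KNOWN_BAD = {hashlib.sha256(b"\x90\x90\xeb\xfe known worm body").hexdigest()}


def ngrams(data: bytes, n: int = 2) -> set:
    """Return the set of distinct byte n-grams found in a file's contents."""
    return {data[i:i + n] for i in range(len(data) - n + 1)}


class NaiveBayes:
    """Naive Bayes over n-gram presence, with Laplace smoothing.

    Only n-grams present in the inspected file contribute to the score;
    this is a simplification chosen to keep the sketch short.
    """

    def fit(self, samples, labels):
        self.classes = sorted(set(labels))
        self.doc_count = Counter(labels)
        self.feat_count = {c: Counter() for c in self.classes}
        self.vocab = set()
        for data, label in zip(samples, labels):
            grams = ngrams(data)
            self.feat_count[label].update(grams)
            self.vocab |= grams
        return self

    def predict(self, data: bytes) -> str:
        grams = ngrams(data) & self.vocab  # ignore n-grams never seen in training
        total = sum(self.doc_count.values())
        best, best_score = None, -math.inf
        for c in self.classes:
            score = math.log(self.doc_count[c] / total)  # class prior
            for g in grams:
                p = (self.feat_count[c][g] + 1) / (self.doc_count[c] + 2)
                score += math.log(p)
            if score > best_score:
                best, best_score = c, score
        return best


def inspect_file(data: bytes, model: NaiveBayes) -> str:
    """Stage 1: signature scan for known code. Stage 2: ML on the remainder."""
    if hashlib.sha256(data).hexdigest() in KNOWN_BAD:
        return "blocked"
    return "flagged" if model.predict(data) == "malicious" else "passed"


# Toy training data (fabricated byte strings, for illustration only).
train = [
    (b"hello world program", "benign"),
    (b"print the report now", "benign"),
    (b"\x90\x90\x90\xcc\xcc jump", "malicious"),
    (b"\xcc\xcc\x90\x90 payload", "malicious"),
]
model = NaiveBayes().fit([d for d, _ in train], [l for _, l in train])

print(inspect_file(b"\x90\x90\xeb\xfe known worm body", model))
print(inspect_file(b"\x90\x90\xcc\xcc", model))
print(inspect_file(b"hello report", model))
```

A file whose digest matches the signature set never reaches the classifier, which mirrors the abstract's split between cleaning known malicious code and applying ML only to the remaining traffic. A realistic system would need far larger training sets and a feature selection step, which this sketch omits.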


Keywords: Malicious Code, Machine Learning, Network Service Provider (NSP), Feature Selection





Copyright information

© Springer-Verlag Berlin Heidelberg 2007

Authors and Affiliations

  • Yuval Elovici (1)
  • Asaf Shabtai (1)
  • Robert Moskovitch (1)
  • Gil Tahan (1)
  • Chanan Glezer (1)

  1. Deutsche Telekom Laboratories at Ben-Gurion University, Be’er Sheva 84105, Israel
