A Novel Semantic-Aware Approach for Detecting Malicious Web Traffic

  • Jing Yang
  • Liming Wang
  • Zhen Xu
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 10631)


In the context of web compromise, malicious web traffic refers to requests from users who visit websites to reach malicious targets, such as web vulnerabilities, web shells, and uploaded malicious advertising pages. Understanding malicious web visits directly and comprehensively helps prevent web compromise. However, it is challenging to identify different kinds of malicious web traffic with a single generic model. In this paper, we propose a novel semantic-aware approach that detects malicious web traffic by profiling web visits individually. We also introduce a semantic representation of malicious activities to make detection results easier to understand. The evaluation shows that our algorithm is effective in detecting malicious traffic, with an average precision and recall of 90.8% and 92.9%, respectively. Furthermore, we apply our approach to more than 136 million web traffic logs collected from a web hosting service provider and detect 3,995 unique malicious IPs involving hundreds of websites. The results show that our method is conducive to figuring out adversaries’ intentions.


Web security, Malicious web traffic, Semantic analysis, Unsupervised learning



This paper is supported by the National Key R&D Program of China (2017YFB0801900).



Copyright information

© Springer International Publishing AG, part of Springer Nature 2018

Authors and Affiliations

  1. State Key Laboratory of Information Security, Institute of Information Engineering, Chinese Academy of Sciences, Beijing, China
  2. School of Cyber Security, University of Chinese Academy of Sciences, Beijing, China
