Detection of Malicious HTTP Requests Using Header and URL Features

  • Conference paper
Proceedings of the Future Technologies Conference (FTC) 2020, Volume 2 (FTC 2020)

Part of the book series: Advances in Intelligent Systems and Computing ((AISC,volume 1289))

Abstract

Cyber attackers leverage the openness of internet traffic to send specially crafted HyperText Transfer Protocol (HTTP) requests and launch sophisticated attacks for a myriad of purposes, including disruption of service, illegal financial gain, and alteration or destruction of confidential medical or personal data. Detection of malicious HTTP requests is therefore essential to counter and prevent web attacks. In this work, we collected web traffic data and used HTTP request header features with supervised machine learning techniques to predict whether a message is likely to be malicious or benign. Our analysis was based on two real-world datasets: one collected over a period of 42 days from a low-interaction honeypot deployed on a Comcast business-class network, and the other collected from a university web server for a similar duration. In our analysis, we observed that: (1) benign and malicious requests differ with respect to their header usage; (2) three specific HTTP headers (i.e., accept-encoding, accept-language, and content-type) can be used to efficiently classify a request as benign or malicious with 93.6% accuracy; (3) HTTP request line lengths of benign and malicious requests differ; and (4) HTTP request line length can be used to efficiently classify a request as benign or malicious with 96.9% accuracy. This implies that a relatively simple predictive model with a fast classification time can be used to filter out malicious web traffic efficiently and accurately.
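The features named in the abstract can be illustrated with a minimal Python sketch. This is not the authors' code: the function name `extract_features`, the feature keys, and both sample requests are hypothetical, and the sketch assumes the features are simple presence flags for the three headers plus the character length of the request line. It only shows how such features might be derived from a raw request before being passed to a supervised classifier.

```python
from typing import Dict

# The three headers the abstract singles out (names are lower-cased here).
KEY_HEADERS = ("accept-encoding", "accept-language", "content-type")


def extract_features(raw_request: str) -> Dict[str, int]:
    """Turn one raw HTTP request into a small numeric feature dict (hypothetical helper)."""
    # Normalize line endings, then keep only the head (request line + headers).
    head = raw_request.replace("\r\n", "\n").split("\n\n", 1)[0]
    lines = head.split("\n")
    request_line = lines[0]

    # Collect header names; HTTP header names are case-insensitive.
    present = set()
    for line in lines[1:]:
        name, sep, _ = line.partition(":")
        if sep:
            present.add(name.strip().lower())

    # Assumption: one 0/1 presence flag per header, plus request line length.
    features = {f"has_{h}": int(h in present) for h in KEY_HEADERS}
    features["request_line_length"] = len(request_line)
    return features


# Two made-up requests, for illustration only (not taken from the paper's datasets).
sample_a = (
    "GET /index.html HTTP/1.1\r\n"
    "Host: example.org\r\n"
    "Accept-Encoding: gzip, deflate\r\n"
    "Accept-Language: en-US\r\n"
    "\r\n"
)
sample_b = "GET /cgi-bin/test.php?cmd=id HTTP/1.1\r\nHost: example.org\r\n\r\n"

features_a = extract_features(sample_a)
features_b = extract_features(sample_b)
```

Feature dicts of this shape could then be converted to rows of a numeric matrix and used to train any standard supervised model; which model and which exact encoding the paper uses is not stated in the abstract.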

Author information

Corresponding author

Correspondence to Safwan Omari.

Copyright information

© 2021 Springer Nature Switzerland AG

About this paper

Cite this paper

Laughter, A., Omari, S., Szczurek, P., Perry, J. (2021). Detection of Malicious HTTP Requests Using Header and URL Features. In: Arai, K., Kapoor, S., Bhatia, R. (eds) Proceedings of the Future Technologies Conference (FTC) 2020, Volume 2. FTC 2020. Advances in Intelligent Systems and Computing, vol 1289. Springer, Cham. https://doi.org/10.1007/978-3-030-63089-8_29
