Detection of Malicious HTTP Requests Using Header and URL Features

Laughter, Ashley; Omari, Safwan; Szczurek, Piotr; Perry, Jason

doi:10.1007/978-3-030-63089-8_29

Part of the book series: Advances in Intelligent Systems and Computing (AISC, volume 1289)

Included in the following conference series:

Proceedings of the Future Technologies Conference

Abstract

Cyber attackers leverage the openness of internet traffic to send specially crafted HyperText Transfer Protocol (HTTP) requests and launch sophisticated attacks for a myriad of purposes, including disruption of service, illegal financial gain, and alteration or destruction of confidential medical or personal data. Detection of malicious HTTP requests is therefore essential to counter and prevent web attacks. In this work, we collected web traffic data and used HTTP request header features with supervised machine learning techniques to predict whether a message is likely to be malicious or benign. Our analysis was based on two real-world datasets: one collected over a period of 42 days from a low-interaction honeypot deployed on a Comcast business-class network, and the other collected from a university web server for a similar duration. In our analysis, we observed that: (1) benign and malicious requests differ with respect to their header usage, (2) three specific HTTP headers (i.e., accept-encoding, accept-language, and content-type) can be used to efficiently classify a request as benign or malicious with 93.6% accuracy, (3) HTTP request line lengths of benign and malicious requests differ, and (4) HTTP request line length can be used to efficiently classify a request as benign or malicious with 96.9% accuracy. This implies that a relatively simple predictive model with a fast classification time can filter out malicious web traffic efficiently and accurately.
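
The two feature groups named in the abstract (presence of the accept-encoding, accept-language, and content-type headers, and the length of the request line) can be illustrated with a minimal Python sketch. This is not the authors' pipeline or model: the function names `extract_features`, `fit_length_stump`, and `predict_from_length` are hypothetical, the one-threshold "decision stump" merely stands in for whatever supervised learner the paper used, and the training pairs in the usage note are made up for illustration.

```python
from typing import Dict, List, Tuple

# The three headers named in the abstract.
KEY_HEADERS = ("accept-encoding", "accept-language", "content-type")


def extract_features(raw_request: str) -> Dict[str, int]:
    """Map a raw HTTP request to header-presence flags plus request line length."""
    # Keep only the head (request line + headers); the body follows the first blank line.
    head = raw_request.replace("\r\n", "\n").split("\n\n", 1)[0]
    lines = head.split("\n")
    request_line = lines[0]
    # Header field names are case-insensitive, so compare them in lowercase.
    names = {line.split(":", 1)[0].strip().lower() for line in lines[1:] if ":" in line}
    features = {f"has_{header}": int(header in names) for header in KEY_HEADERS}
    features["request_line_length"] = len(request_line)
    return features


def fit_length_stump(samples: List[Tuple[int, int]]) -> Tuple[int, int]:
    """Fit a one-feature threshold rule on (request_line_length, label) pairs.

    Label 1 means malicious and 0 means benign. Returns (threshold, label_above):
    lengths greater than `threshold` get `label_above`, all others the opposite label.
    The direction is learned from the data rather than assumed.
    """
    best_threshold, best_label, best_correct = 0, 1, -1
    for threshold in sorted({length for length, _ in samples}):
        for label_above in (0, 1):
            correct = sum(
                (label_above if length > threshold else 1 - label_above) == label
                for length, label in samples
            )
            if correct > best_correct:
                best_threshold, best_label, best_correct = threshold, label_above, correct
    return best_threshold, best_label


def predict_from_length(length: int, stump: Tuple[int, int]) -> int:
    """Apply a fitted (threshold, label_above) rule to one request line length."""
    threshold, label_above = stump
    return label_above if length > threshold else 1 - label_above


# Illustrative request; the host and path are placeholders.
EXAMPLE_REQUEST = (
    "GET /index.html HTTP/1.1\r\n"
    "Host: example.org\r\n"
    "Accept-Encoding: gzip, deflate\r\n"
    "Accept-Language: en-US\r\n"
    "\r\n"
)

# Made-up (length, label) pairs, NOT data from the paper.
TOY_SAMPLES = [(24, 0), (30, 0), (41, 0), (180, 1), (260, 1)]
```

Calling `extract_features(EXAMPLE_REQUEST)` yields three 0/1 presence flags and an integer length, and `fit_length_stump(TOY_SAMPLES)` returns a rule that `predict_from_length` can apply to a new request. A classifier of this shape needs only a set lookup or a single comparison per request, which is in line with the abstract's point about fast classification time. The accuracy figures quoted in the abstract come from the authors' datasets and models, not from this sketch.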

Author information

Authors and Affiliations

Lewis University, Romeoville, IL, 60446, USA
Ashley Laughter, Safwan Omari, Piotr Szczurek & Jason Perry

Corresponding author

Correspondence to Safwan Omari.

Editor information

Editors and Affiliations

Faculty of Science and Engineering, Saga University, Saga, Japan
Kohei Arai
The Science and Information (SAI) Organization, Bradford, West Yorkshire, UK
Supriya Kapoor
The Science and Information (SAI) Organization, Bradford, West Yorkshire, UK
Rahul Bhatia

About this paper

Cite this paper

Laughter, A., Omari, S., Szczurek, P., Perry, J. (2021). Detection of Malicious HTTP Requests Using Header and URL Features. In: Arai, K., Kapoor, S., Bhatia, R. (eds) Proceedings of the Future Technologies Conference (FTC) 2020, Volume 2. FTC 2020. Advances in Intelligent Systems and Computing, vol 1289. Springer, Cham. https://doi.org/10.1007/978-3-030-63089-8_29

DOI: https://doi.org/10.1007/978-3-030-63089-8_29
Published: 01 November 2020
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-63088-1
Online ISBN: 978-3-030-63089-8
eBook Packages: Intelligent Technologies and Robotics (R0)
