Abstract
Cyber attackers exploit the openness of internet traffic to send specially crafted HyperText Transfer Protocol (HTTP) requests and launch sophisticated attacks for a myriad of purposes, including disruption of service, illegal financial gain, and alteration or destruction of confidential medical or personal data. Detecting malicious HTTP requests is therefore essential to countering and preventing web attacks. In this work, we collected web traffic data and used HTTP request header features with supervised machine learning techniques to predict whether a request is likely to be malicious or benign. Our analysis was based on two real-world datasets: one collected over a period of 42 days from a low-interaction honeypot deployed on a Comcast business-class network, and the other collected from a university web server over a similar duration. We observed that: (1) benign and malicious requests differ in their header usage, (2) three specific HTTP headers (accept-encoding, accept-language, and content-type) can be used to classify a request as benign or malicious with 93.6% accuracy, (3) the HTTP request line lengths of benign and malicious requests differ, and (4) HTTP request line length can be used to classify a request as benign or malicious with 96.9% accuracy. These results suggest that a relatively simple predictive model with a fast classification time can filter out malicious web traffic efficiently and accurately.
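The two feature families named in the abstract (presence of three request headers, and request line length) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the abstract does not say which supervised models were used, so a one-feature decision stump stands in for them here, and the function names `extract_features`, `fit_length_stump`, and `predict` are hypothetical. Any training lengths and labels passed to it in an example would be invented and say nothing about which class tends to have longer request lines.

```python
from typing import Dict, List, Tuple

# The three headers named in the abstract; the feature is presence (1) or absence (0).
KEY_HEADERS = ("accept-encoding", "accept-language", "content-type")


def extract_features(raw_request: str) -> Dict[str, int]:
    """Return request-line length and header-presence flags for one raw HTTP request."""
    # The header section ends at the first blank line; its first line is the request line.
    head = raw_request.replace("\r\n", "\n").split("\n\n", 1)[0]
    lines = head.split("\n")
    present = {
        line.split(":", 1)[0].strip().lower() for line in lines[1:] if ":" in line
    }
    features = {"request_line_length": len(lines[0])}
    for name in KEY_HEADERS:
        features["has_" + name] = int(name in present)
    return features


def fit_length_stump(lengths: List[int], labels: List[int]) -> Tuple[int, int]:
    """Fit a decision stump on request-line length (assumed stand-in classifier).

    Tries every observed length as a threshold, in both directions, and keeps
    the combination with the fewest training errors.
    Returns (threshold, label_assigned_at_or_below_threshold).
    """
    best = (len(labels) + 1, 0, 0)  # (errors, threshold, low_label)
    for threshold in sorted(set(lengths)):
        for low_label in (0, 1):
            errors = sum(
                (low_label if x <= threshold else 1 - low_label) != y
                for x, y in zip(lengths, labels)
            )
            if errors < best[0]:
                best = (errors, threshold, low_label)
    return best[1], best[2]


def predict(length: int, stump: Tuple[int, int]) -> int:
    """Classify one request-line length with a fitted stump (1 = malicious, 0 = benign)."""
    threshold, low_label = stump
    return low_label if length <= threshold else 1 - low_label
```

In use, `extract_features` would be applied to each captured request, and the resulting `request_line_length` values, together with benign/malicious labels, passed to `fit_length_stump`. A single threshold comparison at prediction time is one way to picture the "fast classification time" the abstract refers to.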
Copyright information
© 2021 Springer Nature Switzerland AG
About this paper
Cite this paper
Laughter, A., Omari, S., Szczurek, P., Perry, J. (2021). Detection of Malicious HTTP Requests Using Header and URL Features. In: Arai, K., Kapoor, S., Bhatia, R. (eds) Proceedings of the Future Technologies Conference (FTC) 2020, Volume 2. FTC 2020. Advances in Intelligent Systems and Computing, vol 1289. Springer, Cham. https://doi.org/10.1007/978-3-030-63089-8_29
DOI: https://doi.org/10.1007/978-3-030-63089-8_29
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-63088-1
Online ISBN: 978-3-030-63089-8
eBook Packages: Intelligent Technologies and Robotics (R0)