Abstract
In recent years, various machine learning and deep learning based models have been developed to detect novel web attacks. These models mostly use NLP methods, such as N-grams and word embeddings, to process URLs as generic strings of characters. In contrast to natural language, which consists of words, a URL is composed of characters and can hardly be decomposed into meaningful segments. In fact, HTTP requests have inherent patterns, the so-called semantic structure: request bodies have fixed types, request parameters have a fixed structure in their names and order, and the values of these parameters carry specific semantics such as username, password, page id, or commodity id. Existing methods have no mechanism to learn this semantic structure. They coarsely apply NLP techniques such as DFA and attention to learn normal patterns from the dataset, and they also need a large amount of data to train. In this paper, we propose a novel web anomaly detection approach based on semantic structure. First, a hierarchical method is proposed to automatically learn the semantic structure from the training dataset. Then, we learn a normal profile for each parameter. The experimental results show that our approach achieves a high precision rate of 99.29% while maintaining a low false alarm rate of 0.88%. Moreover, even on a small training dataset composed of hundreds of samples, it still achieves an accuracy rate of 96.3%.
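To make the two steps in the abstract concrete (learn the request structure, then learn a normal profile per parameter), here is a minimal Python sketch. It is not the authors' method: the abstract does not specify the hierarchical algorithm or the profile features, so the sketch simply assumes that "structure" is the pair (URL path, ordered parameter names) and that a "profile" is the maximum observed value length plus the set of coarse character classes seen. The function names `learn_profiles` and `is_anomalous` and the `slack` factor are hypothetical.

```python
from collections import defaultdict
from urllib.parse import urlsplit, parse_qsl


def char_classes(value):
    """Return the coarse character classes (digit/alpha/other) found in a value."""
    classes = set()
    for ch in value:
        if ch.isdigit():
            classes.add("digit")
        elif ch.isalpha():
            classes.add("alpha")
        else:
            classes.add("other")
    return classes


def structure_key(url):
    """Assumed notion of structure: the path plus the ordered parameter names."""
    parts = urlsplit(url)
    params = parse_qsl(parts.query, keep_blank_values=True)
    return (parts.path, tuple(name for name, _ in params)), params


def learn_profiles(urls):
    """Group normal requests by structure and build a simple profile per parameter."""
    profiles = defaultdict(
        lambda: defaultdict(lambda: {"max_len": 0, "classes": set()})
    )
    for url in urls:
        key, params = structure_key(url)
        for name, value in params:
            prof = profiles[key][name]
            prof["max_len"] = max(prof["max_len"], len(value))
            prof["classes"] |= char_classes(value)
    return profiles


def is_anomalous(url, profiles, slack=2):
    """Flag a request with an unseen structure or an out-of-profile value."""
    key, params = structure_key(url)
    if key not in profiles:
        return True  # path or parameter layout never seen in training
    for name, value in params:
        prof = profiles[key][name]
        if len(value) > prof["max_len"] * slack:
            return True  # value much longer than anything seen in training
        if not char_classes(value) <= prof["classes"]:
            return True  # value contains a character class unseen in training
    return False
```

For example, after `profiles = learn_profiles(["/item?id=12", "/item?id=345"])`, calling `is_anomalous("/item?id=1%27%20OR%20%271", profiles)` flags the request because the decoded value contains letters and punctuation that never appeared in the numeric `id` parameter. A real detector would need richer profiles than this toy one.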
Copyright information
© 2020 Springer Nature Singapore Pte Ltd.
About this paper
Cite this paper
Cheng, Z., Cui, B., Fu, J. (2020). A Novel Web Anomaly Detection Approach Based on Semantic Structure. In: Xiang, Y., Liu, Z., Li, J. (eds) Security and Privacy in Social Networks and Big Data. SocialSec 2020. Communications in Computer and Information Science, vol 1298. Springer, Singapore. https://doi.org/10.1007/978-981-15-9031-3_2
DOI: https://doi.org/10.1007/978-981-15-9031-3_2
Published:
Publisher Name: Springer, Singapore
Print ISBN: 978-981-15-9030-6
Online ISBN: 978-981-15-9031-3
eBook Packages: Computer Science, Computer Science (R0)