Web Application Attacks Detection Using Deep Learning

Montes, Nicolás; Betarte, Gustavo; Martínez, Rodrigo; Pardo, Alvaro

doi:10.1007/978-3-030-93420-0_22

Part of the book series: Lecture Notes in Computer Science (LNIP, volume 12702)

Included in the following conference series:

Iberoamerican Congress on Pattern Recognition

Abstract

This work investigates the use of deep learning techniques to improve the performance of web application firewalls (WAFs), systems that detect and prevent attacks on web applications. Typically, a WAF inspects the HTTP requests exchanged between client and server to spot attacks and block potential threats. We model the problem as a one-class supervised case and build a feature extractor using deep learning techniques. We treat HTTP requests as text and train a deep language model with a transformer encoder architecture, a self-attention-based neural network. Pre-trained language models have yielded significant improvements on a diverse set of NLP tasks because they enable transfer learning. We use the pre-trained model as a feature extractor that maps an HTTP request to a feature vector. These vectors are then used to train a one-class classifier. We also use a performance metric to automatically define an operational point for the one-class model. The experimental results show that the proposed approach outperforms the classic rule-based ModSecurity configured with a vanilla OWASP CRS and does not require a security expert to define the features.
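The overall shape of the pipeline in the abstract (request text, then feature vector, then one-class model with an operating point) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: `featurize` is a hashed character n-gram stand-in for the pre-trained transformer encoder, `CentroidOneClass` is a simple distance-to-centroid model rather than the paper's one-class classifier, and the threshold is picked as a quantile of training scores (`target_fpr`), which is not necessarily the performance metric the authors use.

```python
import math
import zlib

def featurize(request, n=3, dim=256):
    """Stand-in feature extractor (NOT the paper's language model):
    hashed byte n-gram counts, L2-normalised."""
    vec = [0.0] * dim
    data = request.encode("utf-8")
    for i in range(max(len(data) - n + 1, 1)):
        # crc32 gives a hash that is stable across processes
        vec[zlib.crc32(data[i:i + n]) % dim] += 1.0
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0
    return [v / norm for v in vec]

class CentroidOneClass:
    """Hypothetical one-class model trained on normal traffic only."""

    def fit(self, vectors, target_fpr=0.05):
        dim = len(vectors[0])
        self.centroid = [sum(v[j] for v in vectors) / len(vectors)
                         for j in range(dim)]
        scores = sorted(self.score(v) for v in vectors)
        # Operating point (assumed rule): the threshold that keeps about
        # (1 - target_fpr) of the normal training requests below it.
        idx = min(len(scores) - 1,
                  math.ceil((1 - target_fpr) * len(scores)) - 1)
        self.threshold = scores[idx]
        return self

    def score(self, v):
        # Anomaly score: Euclidean distance to the centroid of normal traffic
        return math.sqrt(sum((a - b) ** 2 for a, b in zip(v, self.centroid)))

    def is_attack(self, v):
        return self.score(v) > self.threshold

# Toy normal traffic and one SQL-injection-style request (made-up examples)
normal = [f"GET /shop/item?id={k} HTTP/1.1" for k in range(1, 41)]
model = CentroidOneClass().fit([featurize(r) for r in normal])
attack = ("GET /shop/item?id=1' OR '1'='1' "
          "UNION SELECT password FROM users-- HTTP/1.1")
flagged_normal = sum(model.is_attack(featurize(r)) for r in normal)
attack_flagged = model.is_attack(featurize(attack))
```

Only normal requests are needed for training, which is what makes the one-class formulation attractive for a WAF: no labelled attack corpus is required to fit the model or to set the threshold.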

This research was partially supported by a grant given to Nicolás Montes from ANII (http://anii.org.uy) and was done in the context of projects FMV_1_2017_136337 (Fondo María Viñas, ANII) and WAFINTL from ICT4V center (http://ict4v.org).

Notes

1. We could have chosen another pre-trained BPE tokenizer instead of the one proposed in [19]. The key point is to use a BPE tokenizer trained on a huge corpus (40 GB of text), because such a tokenizer can tokenize any word (and any character) of any language without resorting to the unknown token.
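Why a BPE tokenizer can avoid the unknown token is easiest to see in a byte-level variant, where the base vocabulary is the 256 possible byte values and merges only combine them. The following is a toy sketch under that assumption: the function names (`learn_merges`, `apply_merge`, `encode`) and the three-line corpus are hypothetical, and the merges are learned on that tiny corpus rather than taken from the pre-trained tokenizer of [19].

```python
from collections import Counter

def apply_merge(seq, pair):
    """Replace every adjacent occurrence of `pair` in `seq` by its concatenation."""
    out, i = [], 0
    while i < len(seq):
        if i + 1 < len(seq) and (seq[i], seq[i + 1]) == pair:
            out.append(seq[i] + seq[i + 1])
            i += 2
        else:
            out.append(seq[i])
            i += 1
    return out

def learn_merges(corpus, num_merges):
    """Learn BPE merges over UTF-8 bytes (base vocabulary: the 256 byte values)."""
    seqs = [[bytes([b]) for b in text.encode("utf-8")] for text in corpus]
    merges = []
    for _ in range(num_merges):
        pairs = Counter()
        for seq in seqs:
            pairs.update(zip(seq, seq[1:]))
        if not pairs:
            break
        best = max(pairs, key=pairs.get)  # most frequent adjacent pair
        merges.append(best)
        seqs = [apply_merge(seq, best) for seq in seqs]
    return merges

def encode(text, merges):
    """Tokenize any string: start from single bytes, apply merges in learned order."""
    seq = [bytes([b]) for b in text.encode("utf-8")]
    for pair in merges:
        seq = apply_merge(seq, pair)
    return seq

# Toy corpus of HTTP request lines (made up for illustration)
corpus = [
    "GET /index.html HTTP/1.1",
    "GET /login?user=alice HTTP/1.1",
    "POST /login HTTP/1.1",
]
merges = learn_merges(corpus, num_merges=20)

seen = encode(corpus[0], merges)
unseen = encode("GET /búsqueda?q=<script>✓ HTTP/1.1", merges)
```

Characters never seen during training (here `ú` and `✓`) simply fall back to their individual bytes, so every input has a valid tokenization and the original request can always be recovered by concatenating the tokens.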

References

1. The Illustrated Transformer - Jay Alammar - Visualizing machine learning one concept at a time. jalammar.github.io/illustrated-transformer/. Accessed 14 Feb 2021
2. Bengio, Y., Ducharme, R., Vincent, P., Janvin, C.: A neural probabilistic language model. J. Mach. Learn. Res. 3, 1137–1155 (2003)
3. Betarte, G., Giménez, E., Martinez, R., Pardo, Á.: Improving web application firewalls through anomaly detection. In: 2018 17th IEEE International Conference on Machine Learning and Applications (ICMLA), pp. 779–784. IEEE (2018)
4. Betarte, G., Martínez, R., Pardo, Á.: Web application attacks detection using machine learning techniques. In: 2018 17th IEEE International Conference on Machine Learning and Applications (ICMLA), pp. 1065–1072. IEEE (2018)
5. Corona, I., Ariu, D., Giacinto, G.: HMM-Web: a framework for the detection of attacks against web applications. In: Proceedings of ICC 2009, pp. 1–6 (2009)
6. Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: BERT: pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018)
7. Ethayarajh, K.: How contextual are contextualized word representations? Comparing the geometry of BERT, ELMo, and GPT-2 embeddings. arXiv preprint arXiv:1909.00512 (2019)
8. Folini, C.: Handling false positives with the OWASP ModSecurity Core Rule Set (2016)
9. Hacker, A.J.: Importance of web application firewall technology for protecting web-based resources. ICSA Labs an Independent Verizon Business (2008)
10. Kruegel, C., Vigna, G.: Anomaly detection of web-based attacks. In: Proceedings of CCS 2003, pp. 251–261. ACM (2003)
11. Lee, W.S., Liu, B.: Learning with positive and unlabeled examples using weighted logistic regression. In: ICML, vol. 3, pp. 448–455 (2003)
12. Liu, Y., et al.: RoBERTa: a robustly optimized BERT pretraining approach. arXiv preprint arXiv:1907.11692 (2019)
13. Martínez, R.: Enhancing web application attack detection using machine learning. Master thesis, Facultad de Ingeniería, UdelaR - Área Informática del Pedeciba, Uruguay (2019)
14. Mikolov, T., Chen, K., Corrado, G., Dean, J.: Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781 (2013)
15. OWASP: OWASP ModSecurity Core Rule Set project. coreruleset.org. Accessed 14 Feb 2021
16. OWASP: OWASP Top Ten project. https://www.owasp.org/index.php/Category:OWASP/Top/Ten/Project. Accessed 14 Feb 2021
17. Peters, M.E., et al.: Deep contextualized word representations. arXiv preprint arXiv:1802.05365 (2018)
18. Qin, Z.Q., Ma, X.K., Wang, Y.J.: Attentional payload anomaly detector for web applications. In: Cheng, L., Leung, A., Ozawa, S. (eds.) Neural Information Processing. ICONIP 2018. LNCS, vol. 11304. Springer, Cham (2018). https://doi.org/10.1007/978-3-030-04212-7_52
19. Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners. OpenAI blog 1(8), 9 (2019)
20. Schölkopf, B., Platt, J.C., Shawe-Taylor, J., Smola, A.J., Williamson, R.C.: Estimating the support of a high-dimensional distribution. Neural Comput. 13(7), 1443–1471 (2001)
21. Sennrich, R., Haddow, B., Birch, A.: Neural machine translation of rare words with subword units. arXiv preprint arXiv:1508.07909 (2015)
22. Sureda Riera, T., Bermejo Higuera, J.-R., Bermejo Higuera, J., Martínez Herraiz, J.-J., Sicilia Montalvo, J.-A.: Prevention and fighting against web attacks through anomaly detection technology. A systematic review. Sustainability 12(12) (2020)
23. Torrano-Gimenez, C., Perez-Villegas, A., Marañón, G.Á., et al.: An anomaly-based approach for intrusion detection in web traffic. J. Inf. Assurance Secur. 5(4), 446–454 (2010)
24. Trustwave Holdings, Inc.: ModSecurity: open source web application firewall
25. Vartouni, A.M., Teshnehlab, M., Kashi, S.S.: Leveraging deep neural networks for anomaly-based web application firewall. IET Inf. Secur. 13(4), 352–361 (2019)
26. Vaswani, A., et al.: Attention is all you need. arXiv preprint arXiv:1706.03762 (2017)
27. Yu, Y., Yan, H., Guan, H., Zhou, H.: DeepHTTP: semantics-structure model with attention for anomalous HTTP traffic detection and pattern mining. arXiv preprint arXiv:1810.12751 (2018)
28. Yuan, G., Li, B., Yao, Y., Zhang, S.: A deep learning enabled subspace spectral ensemble clustering approach for web anomaly detection. In: 2017 International Joint Conference on Neural Networks (IJCNN), pp. 3896–3903. IEEE (2017)

Author information

Authors and Affiliations

Instituto de Computación, Facultad de Ingeniería, Universidad de la República, Montevideo, Uruguay
Gustavo Betarte & Rodrigo Martínez
Departamento de Ingeniería, Facultad de Ingeniería y Tecnologías, Universidad Católica del Uruguay, Montevideo, Uruguay
Alvaro Pardo
Facultad de Ingeniería, Universidad de la República, Montevideo, Uruguay
Nicolás Montes

Corresponding authors

Correspondence to Gustavo Betarte, Rodrigo Martínez or Alvaro Pardo.

Editor information

Editors and Affiliations

Universidade do Porto, Porto, Portugal
João Manuel R. S. Tavares
Universidade Estadual Paulista, São Paulo, Brazil
João Paulo Papa
University of the Balearic Islands, Palma de Mallorca, Spain
Manuel González Hidalgo

About this paper

Cite this paper

Montes, N., Betarte, G., Martínez, R., Pardo, A. (2021). Web Application Attacks Detection Using Deep Learning. In: Tavares, J.M.R.S., Papa, J.P., González Hidalgo, M. (eds) Progress in Pattern Recognition, Image Analysis, Computer Vision, and Applications. CIARP 2021. Lecture Notes in Computer Science, vol 12702. Springer, Cham. https://doi.org/10.1007/978-3-030-93420-0_22

DOI: https://doi.org/10.1007/978-3-030-93420-0_22
Published: 13 January 2022
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-93419-4
Online ISBN: 978-3-030-93420-0
eBook Packages: Computer Science, Computer Science (R0)
