The paper discusses selected issues related to the implementation and deployment of a Web Application Firewall that protects the target application by verifying incoming requests and their parameters against recorded usage patterns. These patterns are in turn learned from the traffic generated by the users of the application. Since many web applications, including those operated by governments, are prone to exploits, there is a need for new, easily implementable methods of protection that prevent unauthorized access to sensitive data. A Learning Web Application Firewall offers a flexible, application-tailored, yet easy-to-deploy solution. There are certain concerns, however, regarding the classification of the data used in the learning process, which can in certain cases impair the firewall's ability to classify traffic correctly. These concerns are discussed on the basis of a reference implementation prepared by the authors.


Keywords: web, firewall, learning, security
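
The learn-then-verify idea described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' reference implementation: the `LearningFirewall` class, the character classes, and the length check are all hypothetical choices made only to show how patterns recorded from user traffic might later be used to accept or reject request parameters.

```python
import re
from collections import defaultdict

# Hypothetical character classes, ordered from narrowest to broadest.
CHAR_CLASSES = [
    ("digits", re.compile(r"\d+")),
    ("alnum", re.compile(r"[A-Za-z0-9]+")),
    ("text", re.compile(r"[\w .,@-]+")),
]


def classify(value):
    """Return the name of the narrowest class that matches the whole value."""
    for name, pattern in CHAR_CLASSES:
        if pattern.fullmatch(value):
            return name
    return "other"


class LearningFirewall:
    """Illustrative only: records per-parameter patterns, then checks requests."""

    def __init__(self):
        # (path, parameter name) -> set of value classes seen while learning
        self.classes = defaultdict(set)
        # (path, parameter name) -> length of the longest value seen
        self.max_len = defaultdict(int)

    def learn(self, path, params):
        """Record the patterns of one request assumed to be legitimate."""
        for name, value in params.items():
            key = (path, name)
            self.classes[key].add(classify(value))
            self.max_len[key] = max(self.max_len[key], len(value))

    def allows(self, path, params):
        """Accept a request only if every parameter fits a learned pattern."""
        for name, value in params.items():
            key = (path, name)
            if key not in self.classes:
                return False  # parameter never seen for this path
            if classify(value) not in self.classes[key]:
                return False  # value class differs from learned usage
            if len(value) > self.max_len[key]:
                return False  # value longer than anything learned
        return True


fw = LearningFirewall()
fw.learn("/item", {"id": "42"})
fw.learn("/item", {"id": "1077"})
print(fw.allows("/item", {"id": "7"}))
print(fw.allows("/item", {"id": "1 OR 1=1"}))
```

The sketch also shows why the classification of learning data matters: if a malicious request such as `{"id": "1 OR 1=1"}` were passed to `learn` as if it were legitimate, similar requests would subsequently be accepted.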



Copyright information

© IFIP International Federation for Information Processing 2011

Authors and Affiliations

  • Dariusz Pałka (1)
  • Marek Zachara (2)
  1. Pedagogical University of Cracow, Poland
  2. University of Science and Technology (AGH), Cracow, Poland
