Specification-Based Intrusion Detection Using Sequence Alignment and Data Clustering

  • Djibrilla Amadou Kountché
  • Sylvain Gombault
Conference paper
Part of the Communications in Computer and Information Science book series (CCIS, volume 523)


In this paper, we present our work on specification-based intrusion detection. Our goal is to build a web application firewall that learns the normal behaviour of an application (and/or of its users) from the traffic between a client and a server. The learnt model is then used to validate future traffic. Later in the paper, we discuss the interactions between the learning phase and the exploitation phase of the generated model, which is expressed as a set of regular expressions. These regular expressions are generated either by sequence alignment combined with BRELA (Basic Regular Expression Learning Algorithm) or directly by the latter. We also present our multiple sequence alignment algorithm, AMAA (Another Multiple Alignment Algorithm), and the use of data clustering to improve the generated regular expressions. In this paper, the detection phase is simulated by generating data that represents traffic and validating it with a pattern matcher.
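To make the learn-then-validate idea concrete, the following is a minimal sketch, not the authors' AMAA or BRELA, whose details are not given here. It assumes a standard Needleman-Wunsch pairwise alignment as a stand-in for the alignment step and a naive column-wise generalisation as a stand-in for regular expression learning. All function names (align, char_class, learn_regex) and the sample parameter strings are hypothetical.

```python
import re

GAP = "\0"  # gap marker; assumed never to occur in the observed values


def align(a, b, match=1, mismatch=-1, gap=-1):
    """Global pairwise alignment (Needleman-Wunsch); a stand-in, not AMAA."""
    n, m = len(a), len(b)
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(score[i - 1][j - 1] + sub,
                              score[i - 1][j] + gap,
                              score[i][j - 1] + gap)
    # Trace back from the bottom-right corner to recover one optimal alignment.
    ra, rb, i, j = [], [], n, m
    while i > 0 or j > 0:
        sub = match if i > 0 and j > 0 and a[i - 1] == b[j - 1] else mismatch
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + sub:
            ra.append(a[i - 1]); rb.append(b[j - 1]); i -= 1; j -= 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            ra.append(a[i - 1]); rb.append(GAP); i -= 1
        else:
            ra.append(GAP); rb.append(b[j - 1]); j -= 1
    return "".join(reversed(ra)), "".join(reversed(rb))


def char_class(chars):
    """Smallest of three coarse classes covering every character in chars."""
    if all(c in "0123456789" for c in chars):
        return "[0-9]"
    if all(c.isascii() and c.isalpha() for c in chars):
        return "[A-Za-z]"
    return "."


def learn_regex(a, b):
    """Generalise two samples: equal columns stay literal, the rest become classes."""
    x, y = align(a, b)
    parts, i = [], 0
    while i < len(x):
        if x[i] == y[i]:
            parts.append(re.escape(x[i]))
            i += 1
            continue
        j = i
        while j < len(x) and x[j] != y[j]:
            j += 1
        seg_x, seg_y = x[i:j].replace(GAP, ""), y[i:j].replace(GAP, "")
        lo, hi = sorted((len(seg_x), len(seg_y)))
        quant = f"{{{lo}}}" if lo == hi else f"{{{lo},{hi}}}"
        parts.append(char_class(seg_x + seg_y) + quant)
        i = j
    return "".join(parts)


# Learning phase on two (hypothetical) observed values, then validation.
pattern = learn_regex("id=42&lang=en", "id=1337&lang=fr")
for candidate in ["id=999&lang=de", "id=42' OR '1'='1&lang=en"]:
    verdict = "accepted" if re.fullmatch(pattern, candidate) else "rejected"
    print(candidate, "->", verdict)
```

By construction, the learnt expression always accepts both training samples. How far it generalises beyond them depends entirely on the coarse character classes chosen in this sketch.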


Positive security; Sequence alignment; Data clustering; Web application firewall; Specification-based IDS



This work is part of the RoCaWeb project, carried out at Kereval and Telecom-Bretagne and financed as a RAPID project by the DGA-MI. We would like to thank Alain Ribault, Constant Chartier, Frédéric Majorczyk and Yacine Tamoudi.



Copyright information

© Springer International Publishing Switzerland 2015

Authors and Affiliations

  1. Institut Mines-Télécom; Télécom Bretagne; IRISA/D2/OCIF RSM, Université Européenne de Bretagne, Rennes, France
