Skip to main content
Log in

Approximate reduction of finite automata for high-speed network intrusion detection

  • Regular Paper
  • Published:
International Journal on Software Tools for Technology Transfer Aims and scope Submit manuscript

Abstract

We consider the problem of approximate reduction of non-deterministic automata that appear in hardware-accelerated network intrusion detection systems (NIDSes). We define an error distance of a reduced automaton from the original one as the probability of packets being incorrectly classified by the reduced automaton (wrt the probabilistic distribution of packets in the network traffic). We use this notion to design an approximate reduction procedure that achieves a great size reduction (much beyond the state-of-the-art language-preserving techniques) with a controlled and small error. We have implemented our approach and evaluated it on use cases from Snort, a popular NIDS. Our results provide experimental evidence that the method can be highly efficient in practice, allowing NIDSes to follow the rapid growth in the speed of networks.

This is a preview of subscription content, log in via an institution to check access.

Access this article

Price excludes VAT (USA)
Tax calculation will be finalised during checkout.

Instant access to the full article PDF.

Similar content being viewed by others

Notes

  1. In theory, disambiguation can produce smaller automata, but, in our experiments, determinization proved to work better.

  2. \({\mathcal {R}}\) is not necessarily a PA since there might be transitions in \({\mathcal {P}}\) that are either removed or copied several times in the product construction.

  3. We use \(\log k\) to denote the base-2 logarithm of k.

  4. We assume an implicit bijection between states of the product \({\mathcal {R}}\) and \(\{1, \ldots , |Q[{\mathcal {R}}]|\}\).

  5. We emphasize that this does not mean that states from V will be simply removed from \({{\mathcal {A}}}\)—the performed operation depends on the particular reduction.

  6. https://github.com/vhavlena/appreal/tree/tacas18

References

  1. Angluin, D.: Learning regular sets from queries and counterexamples. Inf. Comput. 75(2), 87–106 (1987). https://doi.org/10.1016/0890-5401(87)90052-6

    Article  MathSciNet  MATH  Google Scholar 

  2. Baier, C., Kiefer, S., Klein, J., Klüppelholz, S., Müller, D., Worrell, J.: Markov chains and unambiguous Büchi automata. In: CAV’16, pp. 23–42. Springer (2016)

  3. Becchi, M., Crowley, P.: A hybrid finite automaton for practical deep packet inspection. In: CoNEXT’07, p. 1. ACM (2007)

  4. Becchi, M., Crowley, P.: An improved algorithm to accelerate regular expression evaluation. In: ANCS’07, pp. 145–154. ACM (2007)

  5. Becchi, M., Wiseman, C., Crowley, P.: Evaluating regular expression matching engines on network and general purpose processors. In: Proceedings of the 5th ACM/IEEE Symposium on Architectures for Networking and Communications Systems, ANCS ’09, pp. 30–39. ACM (2009)

  6. Benedikt, M., Lenhardt, R., Worrell, J.: Model checking Markov chains against unambiguous Büchi automata. CoRR arXiv:1405.4560v2 (2016)

  7. Bouajjani, A., Habermehl, P., Rogalewicz, A., Vojnar, T.: Abstract regular (tree) model checking. STTT 14(2), 167–191 (2012)

    Article  Google Scholar 

  8. Brodie, B.C., Taylor, D.E., Cytron, R.K.: A scalable architecture for high-throughput regular-expression pattern matching. In: ISCA’06, pp. 191–202. IEEE Computer Society (2006)

  9. Bustan, D., Grumberg, O.: Simulation-based minimization. ACM Trans. Comput. Log. 4(2), 181–206 (2003)

    Article  MathSciNet  Google Scholar 

  10. Carrasco, R.C., Oncina, J.: Learning stochastic regular grammars by means of a state merging method. In: Proceedings of the Second International Colloquium on Grammatical Inference and Applications, ICGI ’94, pp. 139–152. Springer (1994)

  11. Češka, M., Havlena, V., Holík, L., Lengál, O., Vojnar, T.: Approximate reduction of finite automata for high-speed network intrusion detection. In: Figshare (2018). https://doi.org/10.6084/m9.figshare.5907055

  12. Češka, M., Havlena, V., Holík, L., Lengál, O., Vojnar, T.: Approximate reduction of finite automata for high-speed network intrusion detection. In: Proceedings of TACAS’18, LNCS, vol. 10806. Springer (2018)

  13. Champarnaud, J., Coulon, F.: NFA reduction algorithms by means of regular inequalities. Theor. Comput. Sci. 327(3), 241–253 (2004)

    Article  MathSciNet  Google Scholar 

  14. Clark, C.R., Schimmel, D.E.: Efficient reconfigurable logic circuits for matching complex network intrusion detection patterns. In: FPL’03, Lecture Notes in Computer Science, vol. 2778, pp. 956–959. Springer (2003)

  15. Clemente, L.: Büchi automata can have smaller quotients. In: ICALP’11, Lecture Notes in Computer Science, vol. 6756, pp. 258–270. Springer (2011)

  16. Csanky, L.: Fast parallel matrix inversion algorithms. In: 16th Annual Symposium on Foundations of Computer Science, pp. 11–12 (1975). https://doi.org/10.1109/SFCS.1975.14

  17. Deza, M.M., Deza, E.: Encyclopedia of Distances. Springer, Berlin (2009)

    Book  Google Scholar 

  18. Etessami, K.: A hierarchy of polynomial-time computable simulations for automata. In: CONCUR 2002—Concurrency Theory, 13th International Conference, Brno, Czech Republic, August 20-23, 2002, Proceedings, Lecture Notes in Computer Science, vol. 2421, pp. 131–144. Springer (2002)

  19. Fortune, S., Wyllie, J.: Parallelism in random access machines. In: Proceedings of the Tenth Annual ACM Symposium on Theory of Computing, STOC ’78, pp. 114–118. ACM, New York, NY, USA (1978). https://doi.org/10.1145/800133.804339

  20. Gange, G., Ganty, P., Stuckey, P.J.: Fixing the state budget: approximation of regular languages with small DFAs. In: ATVA’17, Lecture Notes in Computer Science, vol. 10482, pp. 67–83. Springer (2017)

  21. Gawrychowski, P., Jez, A.: Hyper-minimisation made efficient. In: MFCS’09, Lecture Notes in Computer Science, vol. 5734, pp. 356–368. Springer (2009)

  22. Hartmanns, A., Wendler, P.: TACAS 2018 artifact evaluation VM. In: Figshare (2018). https://doi.org/10.6084/m9.figshare.5896615

  23. Hogben, L.: Handbook of Linear Algebra, 2nd edn. CRC Press, Boca Raton (2013)

    Book  Google Scholar 

  24. Hopcroft, J.E.: An N log N algorithm for minimizing states in a finite automaton. Technical report (1971)

  25. Hutchings, B.L., Franklin, R., Carver, D.: Assisting network intrusion detection with reconfigurable hardware. In: FCCM’02, pp. 111–120. IEEE Computer Society (2002)

  26. Jiang, T., Ravikumar, B.: Minimal NFA problems are hard. SIAM J. Comput. 22(6), 1117–1141 (1993)

    Article  MathSciNet  Google Scholar 

  27. Kaštil, J., Kořenek, J., Lengál, O.: Methodology for fast pattern matching by deterministic finite automaton with perfect hashing. In: 2009 12th Euromicro Conference on Digital System Design, Architectures, Methods and Tools, pp. 823–829 (2009)

  28. Kořenek, J., Kobierský, P.: Intrusion detection system intended for multigigabit networks. In: 2007 IEEE Design and Diagnostics of Electronic Circuits and Systems, pp. 1–4 (2007)

  29. Kumar, S., Chandrasekaran, B., Turner, J.S., Varghese, G.: Curing regular expressions matching algorithms from insomnia, amnesia, and acalculia. In: ANCS’07, pp. 155–164. ACM (2007)

  30. Kumar, S., Dharmapurikar, S., Yu, F., Crowley, P., Turner, J.S.: Algorithms to accelerate multiple regular expressions matching for deep packet inspection. In: SIGCOMM’06, pp. 339–350. ACM (2006)

  31. Kumar, S., Turner, J.S., Williams, J.: Advanced algorithms for fast and scalable deep packet inspection. In: ANCS’06, pp. 81–92. ACM (2006)

  32. Liu, C., Wu, J.: Fast deep packet inspection with a dual finite automata. IEEE Trans. Comput. 62(2), 310–321 (2013)

    Article  MathSciNet  Google Scholar 

  33. Luchaup, D., De Carli, L., Jha, S., Bach, E.: Deep packet inspection with DFA-trees and parametrized language overapproximation. In: INFOCOM’14, pp. 531–539. IEEE (2014)

  34. Malcher, A.: Minimizing finite automata is computationally hard. Theor. Comput. Sci. 327(3), 375–390 (2004)

    Article  MathSciNet  Google Scholar 

  35. Maletti, A., Quernheim, D.: Optimal hyper-minimization. CoRR arXiv:1104.3007 (2011)

  36. Matoušek, D., Kořenek, J., Puš, V.: High-speed regular expression matching with pipelined automata. In: 2016 International Conference on Field-Programmable Technology (FPT), pp. 93–100 (2016)

  37. Mayr, R., Clemente, L.: Advanced automata minimization. In: POPL’13, Transactions on Computer Logic, pp. 63–74. ACM (2013)

  38. Mayr, R., et al.: Reduce: A tool for minimizing nondeterministic finite-word and Büchi automata. http://languageinclusion.org/doku.php?id=tools (2017). Accessed 30 Sept 2017

  39. Mitra, A., Najjar, W.A., Bhuyan, L.N.: Compiling PCRE to FPGA for accelerating SNORT IDS. In: ANCS’07, pp. 127–136. ACM (2007)

  40. Mohri, M.: A disambiguation algorithm for finite automata and functional transducers. In: CIAA’12, pp. 265–277. Springer (2012)

  41. Mohri, M.: Edit-distance of weighted automata. In: CIAA’02, Lecture Notes in Computer Science, vol. 2608, pp. 1–23. Springer (2002)

  42. Paige, R., Tarjan, R.E.: Three partition refinement algorithms. SIAM J. Comput. 16(6), 973–989 (1987)

    Article  MathSciNet  Google Scholar 

  43. Papadimitriou, C.M.: Computational Complexity. Addison-Wesley, Reading (1994)

  44. Parker, A.J., Yancey, K.B., Yancey, M.P.: Regular language distance and entropy. CoRR arXiv:1602.07715 (2016)

  45. Puš, V., Tobola, J., Košař, V., Kaštil, J., Kořenek, J.: Netbench: framework for evaluation of packet processing algorithms. In: Symposium On Architecture For Networking And Communications Systems pp. 95–96 (2011)

  46. Shützenberger, M.: On the definition of a family of automata. Inf. Control 4, 245–270 (1961)

    Article  MathSciNet  Google Scholar 

  47. Sidhu, R.P.S., Prasanna, V.K.: Fast regular expression matching using FPGAs. In: FCCM’01, pp. 227–238. IEEE Computer Society (2001)

  48. Solodovnikov, V.I.: Upper bounds on the complexity of solving systems of linear equations. J. Sov. Math. 29(4), 1482–1501 (1985)

    Article  MathSciNet  Google Scholar 

  49. Tan, L., Sherwood, T.: A high throughput string matching architecture for intrusion detection and prevention. In: ISCA’05, pp. 112–122. IEEE Computer Society (2005)

  50. The Snort Team: Snort. http://www.snort.org. Accessed 30 Sept 2017

  51. Thollard, F., Clark, A.: Learning stochastic deterministic regular languages. In: G. Paliouras, Y. Sakakibara (eds.) Grammatical Inference: Algorithms and Applications: 7th International Colloquium, ICGI 2004, Athens, Greece, October 11–13, 2004. Proceedings, pp. 248–259. Springer Berlin Heidelberg, Berlin, Heidelberg (2004). https://doi.org/10.1007/978-3-540-30195-0_22

  52. Vardi, M.Y.: Automatic verification of probabilistic concurrent finite state programs. In: SFCS ’85, pp. 327–338. IEEE

  53. Yu, F., Chen, Z., Diao, Y., Lakshman, T.V., Katz, R.H.: Fast and memory-efficient regular expression matching for deep packet inspection. In: ANCS’06, pp. 93–102. ACM (2006)

Download references

Acknowledgements

We thank Jan Kořenek, Vlastimil Košař, and Denis Matoušek for their help with translating regexes into automata and synthesis of FPGA designs, and Martin Žádník for providing us with the backbone network traffic. We thank Stefan Kiefer for helping us proving the PSPACE part of Lemma 1 and Petr Peringer for testing our artefact. We also thank the anonymous reviewers for their useful comments, which improved the quality of the paper. The work on this paper was supported by the Czech Science Foundation project 16-24707Y, the IT4IXS: IT4Innovations Excellence in Science project (LQ1602), and the FIT BUT internal project FIT-S-17-4014.

Author information

Authors and Affiliations

Authors

Corresponding author

Correspondence to Ondřej Lengál.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Reprints and permissions

About this article

Check for updates. Verify currency and authenticity via CrossMark

Cite this article

Češka, M., Havlena, V., Holík, L. et al. Approximate reduction of finite automata for high-speed network intrusion detection. Int J Softw Tools Technol Transfer 22, 523–539 (2020). https://doi.org/10.1007/s10009-019-00520-8

Download citation

  • Published:

  • Issue Date:

  • DOI: https://doi.org/10.1007/s10009-019-00520-8

Keywords

Navigation