Approximate reduction of finite automata for high-speed network intrusion detection

Češka, Milan; Havlena, Vojtěch; Holík, Lukáš; Lengál, Ondřej; Vojnar, Tomáš

doi:10.1007/s10009-019-00520-8

Approximate reduction of finite automata for high-speed network intrusion detection

Regular Paper
Published: 24 May 2019

Volume 22, pages 523–539, (2020)
Cite this article

International Journal on Software Tools for Technology Transfer Aims and scope Submit manuscript

363 Accesses
8 Citations
Explore all metrics

Abstract

We consider the problem of approximate reduction of non-deterministic automata that appear in hardware-accelerated network intrusion detection systems (NIDSes). We define an error distance of a reduced automaton from the original one as the probability of packets being incorrectly classified by the reduced automaton (wrt the probabilistic distribution of packets in the network traffic). We use this notion to design an approximate reduction procedure that achieves a great size reduction (much beyond the state-of-the-art language-preserving techniques) with a controlled and small error. We have implemented our approach and evaluated it on use cases from Snort, a popular NIDS. Our results provide experimental evidence that the method can be highly efficient in practice, allowing NIDSes to follow the rapid growth in the speed of networks.

This is a preview of subscription content, log in via an institution to check access.

Access this article

Log in via an institution

Price excludes VAT (USA)
Tax calculation will be finalised during checkout.

Instant access to the full article PDF.

Institutional subscriptions

Approximate Reduction of Finite Automata for High-Speed Network Intrusion Detection

Improving the Resilience of an IDS against Performance Throttling Attacks

Fast and Scalable Regular Expressions Matching with Multi-Stride Index NFA

Notes

In theory, disambiguation can produce smaller automata, but, in our experiments, determinization proved to work better.
\({\mathcal {R}}\) is not necessarily a PA since there might be transitions in \({\mathcal {P}}\) that are either removed or copied several times in the product construction.
We use \(\log k\) to denote the base-2 logarithm of k.
We assume an implicit bijection between states of the product \({\mathcal {R}}\) and \(\{1, \ldots , |Q[{\mathcal {R}}]|\}\).
We emphasize that this does not mean that states from V will be simply removed from \({{\mathcal {A}}}\)—the performed operation depends on the particular reduction.
https://github.com/vhavlena/appreal/tree/tacas18

References

Angluin, D.: Learning regular sets from queries and counterexamples. Inf. Comput. 75(2), 87–106 (1987). https://doi.org/10.1016/0890-5401(87)90052-6
Article MathSciNet MATH Google Scholar
Baier, C., Kiefer, S., Klein, J., Klüppelholz, S., Müller, D., Worrell, J.: Markov chains and unambiguous Büchi automata. In: CAV’16, pp. 23–42. Springer (2016)
Becchi, M., Crowley, P.: A hybrid finite automaton for practical deep packet inspection. In: CoNEXT’07, p. 1. ACM (2007)
Becchi, M., Crowley, P.: An improved algorithm to accelerate regular expression evaluation. In: ANCS’07, pp. 145–154. ACM (2007)
Becchi, M., Wiseman, C., Crowley, P.: Evaluating regular expression matching engines on network and general purpose processors. In: Proceedings of the 5th ACM/IEEE Symposium on Architectures for Networking and Communications Systems, ANCS ’09, pp. 30–39. ACM (2009)
Benedikt, M., Lenhardt, R., Worrell, J.: Model checking Markov chains against unambiguous Büchi automata. CoRR arXiv:1405.4560v2 (2016)
Bouajjani, A., Habermehl, P., Rogalewicz, A., Vojnar, T.: Abstract regular (tree) model checking. STTT 14(2), 167–191 (2012)
Article Google Scholar
Brodie, B.C., Taylor, D.E., Cytron, R.K.: A scalable architecture for high-throughput regular-expression pattern matching. In: ISCA’06, pp. 191–202. IEEE Computer Society (2006)
Bustan, D., Grumberg, O.: Simulation-based minimization. ACM Trans. Comput. Log. 4(2), 181–206 (2003)
Article MathSciNet Google Scholar
Carrasco, R.C., Oncina, J.: Learning stochastic regular grammars by means of a state merging method. In: Proceedings of the Second International Colloquium on Grammatical Inference and Applications, ICGI ’94, pp. 139–152. Springer (1994)
Češka, M., Havlena, V., Holík, L., Lengál, O., Vojnar, T.: Approximate reduction of finite automata for high-speed network intrusion detection. In: Figshare (2018). https://doi.org/10.6084/m9.figshare.5907055
Češka, M., Havlena, V., Holík, L., Lengál, O., Vojnar, T.: Approximate reduction of finite automata for high-speed network intrusion detection. In: Proceedings of TACAS’18, LNCS, vol. 10806. Springer (2018)
Champarnaud, J., Coulon, F.: NFA reduction algorithms by means of regular inequalities. Theor. Comput. Sci. 327(3), 241–253 (2004)
Article MathSciNet Google Scholar
Clark, C.R., Schimmel, D.E.: Efficient reconfigurable logic circuits for matching complex network intrusion detection patterns. In: FPL’03, Lecture Notes in Computer Science, vol. 2778, pp. 956–959. Springer (2003)
Clemente, L.: Büchi automata can have smaller quotients. In: ICALP’11, Lecture Notes in Computer Science, vol. 6756, pp. 258–270. Springer (2011)
Csanky, L.: Fast parallel matrix inversion algorithms. In: 16th Annual Symposium on Foundations of Computer Science, pp. 11–12 (1975). https://doi.org/10.1109/SFCS.1975.14
Deza, M.M., Deza, E.: Encyclopedia of Distances. Springer, Berlin (2009)
Book Google Scholar
Etessami, K.: A hierarchy of polynomial-time computable simulations for automata. In: CONCUR 2002—Concurrency Theory, 13th International Conference, Brno, Czech Republic, August 20-23, 2002, Proceedings, Lecture Notes in Computer Science, vol. 2421, pp. 131–144. Springer (2002)
Fortune, S., Wyllie, J.: Parallelism in random access machines. In: Proceedings of the Tenth Annual ACM Symposium on Theory of Computing, STOC ’78, pp. 114–118. ACM, New York, NY, USA (1978). https://doi.org/10.1145/800133.804339
Gange, G., Ganty, P., Stuckey, P.J.: Fixing the state budget: approximation of regular languages with small DFAs. In: ATVA’17, Lecture Notes in Computer Science, vol. 10482, pp. 67–83. Springer (2017)
Gawrychowski, P., Jez, A.: Hyper-minimisation made efficient. In: MFCS’09, Lecture Notes in Computer Science, vol. 5734, pp. 356–368. Springer (2009)
Hartmanns, A., Wendler, P.: TACAS 2018 artifact evaluation VM. In: Figshare (2018). https://doi.org/10.6084/m9.figshare.5896615
Hogben, L.: Handbook of Linear Algebra, 2nd edn. CRC Press, Boca Raton (2013)
Book Google Scholar
Hopcroft, J.E.: An N log N algorithm for minimizing states in a finite automaton. Technical report (1971)
Hutchings, B.L., Franklin, R., Carver, D.: Assisting network intrusion detection with reconfigurable hardware. In: FCCM’02, pp. 111–120. IEEE Computer Society (2002)
Jiang, T., Ravikumar, B.: Minimal NFA problems are hard. SIAM J. Comput. 22(6), 1117–1141 (1993)
Article MathSciNet Google Scholar
Kaštil, J., Kořenek, J., Lengál, O.: Methodology for fast pattern matching by deterministic finite automaton with perfect hashing. In: 2009 12th Euromicro Conference on Digital System Design, Architectures, Methods and Tools, pp. 823–829 (2009)
Kořenek, J., Kobierský, P.: Intrusion detection system intended for multigigabit networks. In: 2007 IEEE Design and Diagnostics of Electronic Circuits and Systems, pp. 1–4 (2007)
Kumar, S., Chandrasekaran, B., Turner, J.S., Varghese, G.: Curing regular expressions matching algorithms from insomnia, amnesia, and acalculia. In: ANCS’07, pp. 155–164. ACM (2007)
Kumar, S., Dharmapurikar, S., Yu, F., Crowley, P., Turner, J.S.: Algorithms to accelerate multiple regular expressions matching for deep packet inspection. In: SIGCOMM’06, pp. 339–350. ACM (2006)
Kumar, S., Turner, J.S., Williams, J.: Advanced algorithms for fast and scalable deep packet inspection. In: ANCS’06, pp. 81–92. ACM (2006)
Liu, C., Wu, J.: Fast deep packet inspection with a dual finite automata. IEEE Trans. Comput. 62(2), 310–321 (2013)
Article MathSciNet Google Scholar
Luchaup, D., De Carli, L., Jha, S., Bach, E.: Deep packet inspection with DFA-trees and parametrized language overapproximation. In: INFOCOM’14, pp. 531–539. IEEE (2014)
Malcher, A.: Minimizing finite automata is computationally hard. Theor. Comput. Sci. 327(3), 375–390 (2004)
Article MathSciNet Google Scholar
Maletti, A., Quernheim, D.: Optimal hyper-minimization. CoRR arXiv:1104.3007 (2011)
Matoušek, D., Kořenek, J., Puš, V.: High-speed regular expression matching with pipelined automata. In: 2016 International Conference on Field-Programmable Technology (FPT), pp. 93–100 (2016)
Mayr, R., Clemente, L.: Advanced automata minimization. In: POPL’13, Transactions on Computer Logic, pp. 63–74. ACM (2013)
Mayr, R., et al.: Reduce: A tool for minimizing nondeterministic finite-word and Büchi automata. http://languageinclusion.org/doku.php?id=tools (2017). Accessed 30 Sept 2017
Mitra, A., Najjar, W.A., Bhuyan, L.N.: Compiling PCRE to FPGA for accelerating SNORT IDS. In: ANCS’07, pp. 127–136. ACM (2007)
Mohri, M.: A disambiguation algorithm for finite automata and functional transducers. In: CIAA’12, pp. 265–277. Springer (2012)
Mohri, M.: Edit-distance of weighted automata. In: CIAA’02, Lecture Notes in Computer Science, vol. 2608, pp. 1–23. Springer (2002)
Paige, R., Tarjan, R.E.: Three partition refinement algorithms. SIAM J. Comput. 16(6), 973–989 (1987)
Article MathSciNet Google Scholar
Papadimitriou, C.M.: Computational Complexity. Addison-Wesley, Reading (1994)
Parker, A.J., Yancey, K.B., Yancey, M.P.: Regular language distance and entropy. CoRR arXiv:1602.07715 (2016)
Puš, V., Tobola, J., Košař, V., Kaštil, J., Kořenek, J.: Netbench: framework for evaluation of packet processing algorithms. In: Symposium On Architecture For Networking And Communications Systems pp. 95–96 (2011)
Shützenberger, M.: On the definition of a family of automata. Inf. Control 4, 245–270 (1961)
Article MathSciNet Google Scholar
Sidhu, R.P.S., Prasanna, V.K.: Fast regular expression matching using FPGAs. In: FCCM’01, pp. 227–238. IEEE Computer Society (2001)
Solodovnikov, V.I.: Upper bounds on the complexity of solving systems of linear equations. J. Sov. Math. 29(4), 1482–1501 (1985)
Article MathSciNet Google Scholar
Tan, L., Sherwood, T.: A high throughput string matching architecture for intrusion detection and prevention. In: ISCA’05, pp. 112–122. IEEE Computer Society (2005)
The Snort Team: Snort. http://www.snort.org. Accessed 30 Sept 2017
Thollard, F., Clark, A.: Learning stochastic deterministic regular languages. In: G. Paliouras, Y. Sakakibara (eds.) Grammatical Inference: Algorithms and Applications: 7th International Colloquium, ICGI 2004, Athens, Greece, October 11–13, 2004. Proceedings, pp. 248–259. Springer Berlin Heidelberg, Berlin, Heidelberg (2004). https://doi.org/10.1007/978-3-540-30195-0_22
Vardi, M.Y.: Automatic verification of probabilistic concurrent finite state programs. In: SFCS ’85, pp. 327–338. IEEE
Yu, F., Chen, Z., Diao, Y., Lakshman, T.V., Katz, R.H.: Fast and memory-efficient regular expression matching for deep packet inspection. In: ANCS’06, pp. 93–102. ACM (2006)

Download references

Acknowledgements

We thank Jan Kořenek, Vlastimil Košař, and Denis Matoušek for their help with translating regexes into automata and synthesis of FPGA designs, and Martin Žádník for providing us with the backbone network traffic. We thank Stefan Kiefer for helping us proving the PSPACE part of Lemma 1 and Petr Peringer for testing our artefact. We also thank the anonymous reviewers for their useful comments, which improved the quality of the paper. The work on this paper was supported by the Czech Science Foundation project 16-24707Y, the IT4IXS: IT4Innovations Excellence in Science project (LQ1602), and the FIT BUT internal project FIT-S-17-4014.

Author information

Authors and Affiliations

IT4Innovations Centre of Excellence, FIT, Brno University of Technology, Brno, Czech Republic
Milan Češka, Vojtěch Havlena, Lukáš Holík, Ondřej Lengál & Tomáš Vojnar

Authors

Milan Češka
View author publications
You can also search for this author in PubMed Google Scholar
Vojtěch Havlena
View author publications
You can also search for this author in PubMed Google Scholar
Lukáš Holík
View author publications
You can also search for this author in PubMed Google Scholar
Ondřej Lengál
View author publications
You can also search for this author in PubMed Google Scholar
Tomáš Vojnar
View author publications
You can also search for this author in PubMed Google Scholar

Corresponding author

Correspondence to Ondřej Lengál.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Reprints and permissions

About this article

Cite this article

Češka, M., Havlena, V., Holík, L. et al. Approximate reduction of finite automata for high-speed network intrusion detection. Int J Softw Tools Technol Transfer 22, 523–539 (2020). https://doi.org/10.1007/s10009-019-00520-8

Download citation

Published: 24 May 2019
Issue Date: October 2020
DOI: https://doi.org/10.1007/s10009-019-00520-8

Keywords

Access this article

Log in via an institution

Price excludes VAT (USA)
Tax calculation will be finalised during checkout.

Instant access to the full article PDF.

Institutional subscriptions

Approximate reduction of finite automata for high-speed network intrusion detection

Abstract

Access this article

Similar content being viewed by others

Approximate Reduction of Finite Automata for High-Speed Network Intrusion Detection

Improving the Resilience of an IDS against Performance Throttling Attacks

Fast and Scalable Regular Expressions Matching with Multi-Stride Index NFA

Notes

References

Acknowledgements

Author information

Authors and Affiliations

Corresponding author

Additional information

Publisher's Note

Rights and permissions

About this article

Cite this article

Keywords

Navigation

Approximate reduction of finite automata for high-speed network intrusion detection

Abstract

Access this article

Similar content being viewed by others

Approximate Reduction of Finite Automata for High-Speed Network Intrusion Detection

Improving the Resilience of an IDS against Performance Throttling Attacks

Fast and Scalable Regular Expressions Matching with Multi-Stride Index NFA

Notes

References

Acknowledgements

Author information

Authors and Affiliations

Corresponding author

Additional information

Publisher's Note

Rights and permissions

About this article

Cite this article

Share this article

Keywords

Search

Navigation