Abstract
Regular expression (RegEx) matching is widely used in networking and security applications. Despite much effort, it remains a fundamentally difficult problem. DFA-based solutions achieve high throughput, but require too much memory to run in high-speed SRAM. NFA-based solutions require little memory, but are too slow. In this paper, we propose RegexFilter, a prefiltering approach. The basic idea is to generate a RegEx print of the RegEx set and use it to filter out most unmatched items before full matching. There are two key technical challenges: generating the RegEx print and matching against it. Generating the RegEx print is tricky because we must trade off two conflicting goals: filtering effectiveness, meaning that the RegEx print should filter out as many unmatched items as possible, and matching speed, meaning that matching against the RegEx print should be as fast as possible. To address the first challenge, we propose measurement tools for RegEx complexity and filtering effectiveness, and use them to guide the generation of the RegEx print. To address the second challenge, we propose a fast RegEx print matching solution using Ternary Content Addressable Memory (TCAM). We implemented our approach and conducted experiments on real-world data sets. Our experimental results show that RegexFilter can speed up the potential throughput of RegEx matching by 21.5 times and 20.3 times for the RegEx sets of the Snort and L7-Filter systems, respectively, at the cost of less than 0.2 Mb of TCAM.
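The prefiltering idea can be illustrated with a minimal software sketch. It is not the paper's RegEx print generation algorithm and does not use TCAM: the rule set, the hand-written relaxed patterns in `PRINTS`, and the function name `match_with_prefilter` are all assumptions made for illustration. The only property the sketch relies on is that each relaxed pattern matches every input its original rule matches, so a miss on the print guarantees a miss on the full rule set.

```python
import re

# Hypothetical rule set; not taken from Snort or L7-Filter.
RULES = [
    r"GET /admin/[a-z]+\.php\?id=\d+",
    r"User-Agent: evilbot/\d+\.\d+",
]

# Hand-written relaxations (assumed, one per rule). Each matches every
# string its rule matches, and possibly more, so a miss on the print
# proves that no rule can match.
PRINTS = [
    r"GET /admin/",
    r"evilbot/",
]

FULL = [re.compile(rule) for rule in RULES]
PRINT = re.compile("|".join(f"(?:{p})" for p in PRINTS))


def match_with_prefilter(payload):
    """Return indices of the rules that match payload.

    The cheap print is checked first; the full rules run only on
    payloads that the print does not filter out.
    """
    if PRINT.search(payload) is None:
        return []  # filtered out: full matching is skipped
    return [i for i, rx in enumerate(FULL) if rx.search(payload)]
```

A payload such as `"GET /index.html"` is rejected by the print alone, while `"GET /admin/ help"` passes the print and is then rejected by the full rules. The second case is the filtering-effectiveness cost of a print that is cheap to match.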
Copyright information
© 2012 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Liu, T., Sun, Y., Liu, A.X., Guo, L., Fang, B. (2012). A Prefiltering Approach to Regular Expression Matching for Network Security Systems. In: Bao, F., Samarati, P., Zhou, J. (eds) Applied Cryptography and Network Security. ACNS 2012. Lecture Notes in Computer Science, vol 7341. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-31284-7_22
DOI: https://doi.org/10.1007/978-3-642-31284-7_22
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-31283-0
Online ISBN: 978-3-642-31284-7
eBook Packages: Computer Science, Computer Science (R0)