Fast Bit-Parallel Matching for Network and Regular Expressions

  • Conference paper
String Processing and Information Retrieval (SPIRE 2010)

Abstract

In this paper, we extend the SHIFT-AND approach of Baeza-Yates and Gonnet (CACM 35(10), 1992) to the matching problem for network expressions, that is, regular expressions without Kleene closure, which are useful in applications such as bioinformatics and event stream processing. Following the study of Navarro and Raffinot (RECOMB, 2001) on extended string matching, we introduce new operations called Scatter, Gather, and Propagate to efficiently compute the ε-moves of the Thompson NFA using the Extended SHIFT-AND approach with integer addition. Using these operations and a property of the Thompson NFA called bi-monotonicity, we present an efficient algorithm for network expression matching that runs in O(ndm/w) time using O(dm) preprocessing and O(dm/w) space, where m and d are the length and the depth of the given network expression, n is the length of the input text, and w is the word length of the underlying computer. Furthermore, we present a modified matching algorithm for the class of regular expressions that runs in O(ndm log(m)/w) time.
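
For orientation, the classical SHIFT-AND scheme that the abstract builds on can be sketched as follows. This is a minimal illustration of the basic exact-matching technique only, not the network expression algorithm of the paper; the function name `shift_and_search` and its return convention are assumptions made for this example. Bit i of the state word is set exactly when the pattern prefix of length i+1 matches a suffix of the text read so far, so one shift and one AND advance all prefixes at once.

```python
from typing import Dict, List


def shift_and_search(pattern: str, text: str) -> List[int]:
    """Return the 0-based end positions of every occurrence of pattern in text.

    Illustrative sketch of basic SHIFT-AND (hypothetical helper, not the
    paper's algorithm). Python integers stand in for machine words.
    """
    m = len(pattern)
    if m == 0:
        return []

    # masks[c] has bit i set iff pattern[i] == c.
    masks: Dict[str, int] = {}
    for i, c in enumerate(pattern):
        masks[c] = masks.get(c, 0) | (1 << i)

    accept = 1 << (m - 1)  # bit that marks a match of the whole pattern
    state = 0              # bit i set iff pattern[:i+1] matches a text suffix
    ends: List[int] = []

    for j, c in enumerate(text):
        # Shift every active prefix one position forward, start a new
        # prefix at bit 0, and keep only prefixes consistent with c.
        state = ((state << 1) | 1) & masks.get(c, 0)
        if state & accept:
            ends.append(j)
    return ends


print(shift_and_search("aba", "ababa"))  # [2, 4]
```

With fixed-width words of w bits, a pattern of length m needs ⌈m/w⌉ words for the state, which is where the m/w factor in the running times quoted above comes from.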

References

  1. Agrawal, J., Diao, Y., Gyllstrom, D., Immerman, N.: Efficient pattern matching over event streams. In: Proc. ACM SIGMOD 2008, pp. 147–160 (2008)

  2. Aho, A.V., Hopcroft, J.E., Ullman, J.D.: The Design and Analysis of Computer Algorithms. Addison-Wesley, Reading (1974)

  3. Baeza-Yates, R., Gonnet, G.H.: A new approach to text searching. Communications of the ACM 35(10), 74–82 (1992)

  4. Bille, P.: New algorithms for regular expression matching. In: Bugliesi, M., Preneel, B., Sassone, V., Wegener, I. (eds.) ICALP 2006. LNCS, vol. 4051, pp. 643–654. Springer, Heidelberg (2006)

  5. Bille, P., Thorup, M.: Faster regular expression matching. In: Albers, S., Marchetti-Spaccamela, A., Matias, Y., Nikoletseas, S., Thomas, W. (eds.) ICALP 2009. LNCS, vol. 5555, pp. 171–182. Springer, Heidelberg (2009)

  6. Bille, P., Thorup, M.: Regular expression matching with multi-strings and intervals. In: Proc. SODA 2010, pp. 1297–1308 (2010)

  7. Champarnaud, J.-M., Coulon, F., Paranthoën, T.: Compact and fast algorithms for safe regular expression search. Int. J. Comput. Math. 81(4), 383–401 (2004)

  8. Grabowski, S., Fredriksson, K.: Bit-parallel string matching under Hamming distance in O(n⌈m/w⌉) worst case time. Inf. Process. Lett. 105(5), 182–187 (2008)

  9. Kaneta, Y., Arimura, H.: Faster bit-parallel algorithm for unordered pseudo-tree matching and tree homeomorphism. In: Proc. of IWOCA 2010 (July 2010); also appeared as Hokkaido U., TCS-TR-A-10-43 (May 2010)

  10. Kaneta, Y., Minato, S., Arimura, H.: Fast bit-parallel matching for network and regular expressions. Hokkaido U., TCS-TR-A-10-47 (August 2010)

  11. Kaneta, Y., Yoshizawa, S., Minato, S., Arimura, H., Miyanaga, Y.: Dynamic reconfigurable bit-parallel architecture for large-scale regular expression matching. Hokkaido U., TCS-TR-A-10-45 (June 2010) (submitting)

  12. Myers, E.W.: A four Russians algorithm for regular expression pattern matching. Journal of the ACM 39(2), 430–448 (1992)

  13. Myers, E.W.: Approximate matching of network expressions with spacers. J. Computational Biology 3(1), 33–51 (1996)

  14. Navarro, G., Raffinot, M.: Fast and simple character classes and bounded gaps pattern matching, with application to protein searching. In: Proc. RECOMB 2001, pp. 231–240 (2001)

  15. Navarro, G., Raffinot, M.: Flexible Pattern Matching in Strings: Practical On-Line Search Algorithms for Texts and Biological Sequences. Cambridge University Press, Cambridge (2002)

  16. Navarro, G., Raffinot, M.: New techniques for regular expression searching. Algorithmica 41, 89–116 (2005)

  17. Sidhu, R., Prasanna, V.K.: Fast regular expression matching using FPGAs. In: Proc. IEEE FCCM 2001, pp. 227–238 (2001)

  18. Thompson, K.: Programming techniques: regular expression search algorithm. Communications of the ACM 11(6), 419–422 (1968)

  19. Wu, S., Manber, U.: Fast text searching: allowing errors. Communications of the ACM 35(10), 83–91 (1992)

Copyright information

© 2010 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Kaneta, Y., Minato, Si., Arimura, H. (2010). Fast Bit-Parallel Matching for Network and Regular Expressions. In: Chavez, E., Lonardi, S. (eds) String Processing and Information Retrieval. SPIRE 2010. Lecture Notes in Computer Science, vol 6393. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-16321-0_39

  • DOI: https://doi.org/10.1007/978-3-642-16321-0_39

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-16320-3

  • Online ISBN: 978-3-642-16321-0

  • eBook Packages: Computer Science, Computer Science (R0)
