Fast Bit-Parallel Matching for Network and Regular Expressions

  • Yusaku Kaneta
  • Shin-ichi Minato
  • Hiroki Arimura
Part of the Lecture Notes in Computer Science book series (LNCS, volume 6393)

Abstract

In this paper, we extend the SHIFT-AND approach of Baeza-Yates and Gonnet (CACM 35(10), 1992) to the matching problem for network expressions, which are regular expressions without Kleene closure and are useful in applications such as bioinformatics and event stream processing. Following the study of Navarro and Raffinot (RECOMB, 2001) on extended string matching, we introduce new operations called Scatter, Gather, and Propagate to efficiently compute ε-moves of the Thompson NFA using the Extended SHIFT-AND approach with integer addition. Using these operations and a property of the Thompson NFA called bi-monotonicity, we present an efficient algorithm for network expression matching that runs in O(ndm/w) time using O(dm) preprocessing and O(dm/w) space, where m and d are the length and the depth of a given network expression, n is the length of an input text, and w is the word length of the underlying computer. Furthermore, we show a modified matching algorithm for the class of regular expressions that runs in O(ndm log(m)/w) time.
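
For orientation, the classic SHIFT-AND technique that the abstract takes as its starting point can be sketched as follows. This is a minimal illustration of plain exact string matching in the style of Baeza-Yates and Gonnet, not the network expression algorithm of this paper; the function name `shift_and` and its return convention (0-based end positions) are assumptions made for the example. Python integers are unbounded, so the sketch ignores the word length w; on a real machine the state would occupy ⌈m/w⌉ words.

```python
def shift_and(pattern, text):
    """Return the 0-based end positions of all occurrences of pattern in text.

    Illustrative sketch of classic SHIFT-AND (exact matching only).
    """
    m = len(pattern)
    if m == 0:
        return []

    # masks[c] has bit i set exactly when pattern[i] == c.
    masks = {}
    for i, c in enumerate(pattern):
        masks[c] = masks.get(c, 0) | (1 << i)

    accept = 1 << (m - 1)  # bit marking "whole pattern matched"
    state = 0              # bit i set: pattern[0..i] matches a suffix of the text read so far
    hits = []
    for j, c in enumerate(text):
        # Shift every active prefix forward by one, start a new prefix (| 1),
        # then keep only the prefixes whose next character is c.
        state = ((state << 1) | 1) & masks.get(c, 0)
        if state & accept:
            hits.append(j)
    return hits
```

Each text character costs one shift, one OR, and one AND on the state word, which is where the division by w in bit-parallel time bounds comes from.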

References

  1. Agrawal, J., Diao, Y., Gyllstrom, D., Immerman, N.: Efficient pattern matching over event streams. In: Proc. ACM SIGMOD 2008, pp. 147–160 (2008)
  2. Aho, A.V., Hopcroft, J.E., Ullman, J.D.: The Design and Analysis of Computer Algorithms. Addison-Wesley, Reading (1974)
  3. Baeza-Yates, R., Gonnet, G.H.: A new approach to text searching. Communications of the ACM 35(10), 74–82 (1992)
  4. Bille, P.: New algorithms for regular expression matching. In: Bugliesi, M., Preneel, B., Sassone, V., Wegener, I. (eds.) ICALP 2006. LNCS, vol. 4051, pp. 643–654. Springer, Heidelberg (2006)
  5. Bille, P., Thorup, M.: Faster regular expression matching. In: Albers, S., Marchetti-Spaccamela, A., Matias, Y., Nikoletseas, S., Thomas, W. (eds.) ICALP 2009. LNCS, vol. 5555, pp. 171–182. Springer, Heidelberg (2009)
  6. Bille, P., Thorup, M.: Regular expression matching with multi-strings and intervals. In: Proc. SODA 2010, pp. 1297–1308 (2010)
  7. Champarnaud, J.-M., Coulon, F., Paranthoën, T.: Compact and fast algorithms for safe regular expression search. Int. J. Comput. Math. 81(4), 383–401 (2004)
  8. Grabowski, S., Fredriksson, K.: Bit-parallel string matching under Hamming distance in O(n⌈m/w⌉) worst case time. Inf. Process. Lett. 105(5), 182–187 (2008)
  9. Kaneta, Y., Arimura, H.: Faster bit-parallel algorithm for unordered pseudo-tree matching and tree homeomorphism. In: Proc. of IWOCA 2010 (July 2010); also appeared as Hokkaido U., TCS-TR-A-10-43 (May 2010)
  10. Kaneta, Y., Minato, S., Arimura, H.: Fast bit-parallel matching for network and regular expressions. Hokkaido U., TCS-TR-A-10-47 (August 2010)
  11. Kaneta, Y., Yoshizawa, S., Minato, S., Arimura, H., Miyanaga, Y.: Dynamic reconfigurable bit-parallel architecture for large-scale regular expression matching. Hokkaido U., TCS-TR-A-10-45 (June 2010) (submitting)
  12. Myers, E.W.: A four Russians algorithm for regular expression pattern matching. Journal of the ACM 39(2), 430–448 (1992)
  13. Myers, E.W.: Approximate matching of network expressions with spacers. J. Computational Biology 3(1), 33–51 (1996)
  14. Navarro, G., Raffinot, M.: Fast and simple character classes and bounded gaps pattern matching, with application to protein searching. In: Proc. RECOMB 2001, pp. 231–240 (2001)
  15. Navarro, G., Raffinot, M.: Flexible Pattern Matching in Strings: Practical On-Line Search Algorithms for Texts and Biological Sequences. Cambridge University Press, Cambridge (2002)
  16. Navarro, G., Raffinot, M.: New techniques for regular expression searching. Algorithmica 41, 89–116 (2005)
  17. Sidhu, R., Prasanna, V.K.: Fast regular expression matching using FPGAs. In: Proc. IEEE FCCM 2001, pp. 227–238 (2001)
  18. Thompson, K.: Programming techniques: regular expression search algorithm. Communications of the ACM 11(6), 419–422 (1968)
  19. Wu, S., Manber, U.: Fast text searching: allowing errors. Communications of the ACM 35(10), 83–91 (1992)

Copyright information

© Springer-Verlag Berlin Heidelberg 2010

Authors and Affiliations

  • Yusaku Kaneta (1)
  • Shin-ichi Minato (1)
  • Hiroki Arimura (1)
  1. Hokkaido University, Sapporo, Japan
