Compressed Automata for Dictionary Matching

  • Tomohiro I
  • Takaaki Nishimoto
  • Shunsuke Inenaga
  • Hideo Bannai
  • Masayuki Takeda
Part of the Lecture Notes in Computer Science book series (LNCS, volume 7982)

Abstract

A variant of the dictionary matching problem is addressed in which the dictionary is given in SLP-compressed form. An algorithm based on the Aho-Corasick automaton is presented that preprocesses the compressed dictionary \(\mathcal{D}\) in \(O(n^4 \log n)\) time using \(O(n^2 \log N)\) space and recognizes all occurrences of the patterns in \(\mathcal{D}\) in amortized \(O(h + m)\) running time per character. Here n and N are, respectively, the compressed and uncompressed sizes of \(\mathcal{D}\), h is the height of \(\mathcal{D}\), and m is the number of patterns in the dictionary.
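
To make the parameters in the abstract concrete, the following is a minimal sketch of the uncompressed baseline, not the compressed algorithm of the paper. It assumes a toy straight-line program (SLP) stored as a Python dict, expands the chosen pattern variables explicitly, and then runs a textbook Aho-Corasick automaton over the expanded patterns. The rule names (`X1` to `X5`), the helper names, and the convention that a terminal rule has height 1 are all assumptions made for illustration.

```python
from collections import deque

# Hypothetical toy SLP: each variable derives one character or the
# concatenation of two earlier variables.
rules = {
    "X1": "a",
    "X2": "b",
    "X3": ("X1", "X2"),   # derives "ab"
    "X4": ("X3", "X1"),   # derives "aba"
    "X5": ("X4", "X3"),   # derives "abaab"
}

def slp_expand(rules, var):
    """Decompress one variable; takes time linear in the derived length."""
    rhs = rules[var]
    if isinstance(rhs, str):
        return rhs
    return slp_expand(rules, rhs[0]) + slp_expand(rules, rhs[1])

def slp_height(rules, var):
    """Height of the derivation tree (assumed convention: terminal rule = 1)."""
    rhs = rules[var]
    if isinstance(rhs, str):
        return 1
    return 1 + max(slp_height(rules, rhs[0]), slp_height(rules, rhs[1]))

def build_aho_corasick(patterns):
    """Textbook Aho-Corasick: trie (goto), failure links, output lists."""
    goto, fail, out = [{}], [0], [[]]
    for idx, p in enumerate(patterns):
        s = 0
        for c in p:
            if c not in goto[s]:
                goto.append({}); fail.append(0); out.append([])
                goto[s][c] = len(goto) - 1
            s = goto[s][c]
        out[s].append(idx)
    queue = deque(goto[0].values())      # depth-1 states fail to the root
    while queue:
        s = queue.popleft()
        for c, t in goto[s].items():
            f = fail[s]
            while f and c not in goto[f]:
                f = fail[f]
            fail[t] = goto[f].get(c, 0)
            out[t] = out[t] + out[fail[t]]   # inherit matches ending here
            queue.append(t)
    return goto, fail, out

def find_occurrences(text, patterns):
    """Return (start_position, pattern_index) for every occurrence in text."""
    goto, fail, out = build_aho_corasick(patterns)
    s, hits = 0, []
    for i, c in enumerate(text):
        while s and c not in goto[s]:
            s = fail[s]
        s = goto[s].get(c, 0)
        for idx in out[s]:
            hits.append((i - len(patterns[idx]) + 1, idx))
    return hits

# Assumed dictionary: the strings derived by X3 and X5, so m = 2.
patterns = [slp_expand(rules, v) for v in ("X3", "X5")]
hits = sorted(find_occurrences("abaabab", patterns))
```

In this toy instance the SLP has 5 rules, the expanded patterns have total length 7, and the derivation tree of `X5` has height 4. The baseline needs space proportional to the expanded dictionary, which can be exponentially larger than the SLP; avoiding that expansion is the motivation for working on the compressed form directly.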

Keywords

Arithmetic Progression, Derivation Tree, Failure Function, Implicit State, Cyclic Part
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.

Copyright information

© Springer-Verlag Berlin Heidelberg 2013

Authors and Affiliations

  • Tomohiro I (1, 2)
  • Takaaki Nishimoto (1)
  • Shunsuke Inenaga (1)
  • Hideo Bannai (1)
  • Masayuki Takeda (1)

  1. Department of Informatics, Kyushu University, Japan
  2. Japan Society for the Promotion of Science (JSPS), Japan
