Compressed Automata for Dictionary Matching

  • Tomohiro I
  • Takaaki Nishimoto
  • Shunsuke Inenaga
  • Hideo Bannai
  • Masayuki Takeda
Part of the Lecture Notes in Computer Science book series (LNCS, volume 7982)

Abstract

A variant of the dictionary matching problem is addressed in which the dictionary is given in SLP-compressed (straight-line program) form. An algorithm based on the Aho-Corasick automaton is presented that preprocesses the compressed dictionary \(\mathcal{D}\) in \(O(n^4 \log n)\) time using \(O(n^2 \log N)\) space and recognizes all occurrences of the patterns in \(\mathcal{D}\) in amortized \(O(h + m)\) running time per character, where \(n\) and \(N\) are, respectively, the compressed and uncompressed sizes of \(\mathcal{D}\), \(h\) is the height of \(\mathcal{D}\), and \(m\) is the number of patterns in the dictionary.
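To make the parameters in the abstract concrete, the following is a minimal sketch of a toy SLP-compressed dictionary. It is not the paper's algorithm. The rule names (`X1` to `X5`), the choice of patterns, and the height convention (a terminal rule has height 1) are assumptions made for illustration only. It shows how the number of rules, the total expanded pattern length, and the derivation-tree height might be computed for such a grammar.

```python
from functools import lru_cache

# Hypothetical SLP: every variable derives either one character
# or the concatenation of two previously defined variables.
rules = {
    "X1": "a",
    "X2": "b",
    "X3": ("X1", "X2"),  # derives "ab"
    "X4": ("X3", "X3"),  # derives "abab"
    "X5": ("X4", "X3"),  # derives "ababab"
}

# Assumed dictionary: the patterns are the strings derived by these variables.
patterns = ["X4", "X5"]


@lru_cache(maxsize=None)
def length(var):
    """Length of the string derived by var, without expanding it."""
    rhs = rules[var]
    if isinstance(rhs, str):
        return 1
    return length(rhs[0]) + length(rhs[1])


@lru_cache(maxsize=None)
def height(var):
    """Height of the derivation tree of var (terminal rules count as 1 here)."""
    rhs = rules[var]
    if isinstance(rhs, str):
        return 1
    return 1 + max(height(rhs[0]), height(rhs[1]))


def expand(var):
    """Fully decompress var; only sensible for small examples."""
    rhs = rules[var]
    if isinstance(rhs, str):
        return rhs
    return expand(rhs[0]) + expand(rhs[1])


n = len(rules)                         # compressed size (number of rules)
N = sum(length(p) for p in patterns)   # uncompressed size of the dictionary
h = max(height(p) for p in patterns)   # height of the grammar
m = len(patterns)                      # number of patterns

print(n, N, h, m, [expand(p) for p in patterns])
```

Because `length` and `height` are memoized over the rules, they run in time proportional to the number of rules rather than to the expanded length, which is the kind of saving that motivates working on the compressed form directly.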


Copyright information

© Springer-Verlag Berlin Heidelberg 2013

Authors and Affiliations

  • Tomohiro I (1, 2)
  • Takaaki Nishimoto (1)
  • Shunsuke Inenaga (1)
  • Hideo Bannai (1)
  • Masayuki Takeda (1)

  1. Department of Informatics, Kyushu University, Japan
  2. Japan Society for the Promotion of Science (JSPS), Japan
