LZ-ABT: A Practical Algorithm for \(\alpha \)-Balanced Grammar Compression

  • Tatsuya Ohno
  • Keisuke Goto
  • Yoshimasa Takabatake
  • Tomohiro I
  • Hiroshi Sakamoto
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 10979)

Abstract

We propose a new LZ78-style grammar compression algorithm, named LZ-ABT, which is a simple online algorithm that, given a string of length N over an alphabet of size \(\sigma \), creates an \(\alpha \)-balanced grammar in \(O(N \log N \log \sigma )\) time and \(O(n)\) space in addition to the input string, where n is the size of the output grammar. LZ-ABT avoids the \(\varOmega (N^{5/4})\) lower bound on the running time of the naive algorithms for LZMW and LZD, two other LZ78-style compression algorithms, which was observed in [Badkobeh et al. SPIRE 2017, pp. 51–67]. We also show that the algorithm can be executed in compressed space, i.e., without storing the whole input string explicitly in memory: in \(O(N \log ^2 N \log \sigma )\) time and \(O(n)\) space, or in \(O(N \log N \log \sigma )\) time and \(O(n \log ^{*} N)\) space. We implement LZ-ABT running in \(O(N \log N \log \sigma )\) time and \(O(N)\) space and empirically show that its performance is competitive with LZD. To the best of our knowledge, this is the first practical implementation of \(\alpha \)-balanced grammar compression.
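The abstract does not define \(\alpha \)-balance. As a rough illustration only, the sketch below assumes the usual definition from the smallest-grammar literature (e.g. reference 2): in a grammar whose rules all have the form X → YZ, every rule must satisfy min(|Y|, |Z|) ≥ \(\alpha \)(|Y| + |Z|), where |·| is the length of the derived string. The function names and the dictionary representation of the grammar are hypothetical, and this is a checker for the property, not the LZ-ABT algorithm itself.

```python
from fractions import Fraction

def expansion_lengths(rules):
    """Return the derived-string length of every nonterminal.

    `rules` maps a nonterminal to a pair (left, right). Any symbol that
    is not a key of `rules` is treated as a terminal of length 1.
    (Hypothetical representation, chosen for this sketch.)
    """
    lengths = {}

    def length(sym):
        if sym not in rules:
            return 1  # terminal symbol
        if sym not in lengths:
            left, right = rules[sym]
            lengths[sym] = length(left) + length(right)
        return lengths[sym]

    for nonterminal in rules:
        length(nonterminal)
    return lengths

def is_alpha_balanced(rules, alpha):
    """Check min(|Y|, |Z|) >= alpha * (|Y| + |Z|) for every rule X -> YZ.

    Assumed definition of alpha-balance; pass `alpha` as a Fraction to
    avoid floating-point rounding at the boundary.
    """
    lengths = expansion_lengths(rules)
    for left, right in rules.values():
        len_left = lengths.get(left, 1)
        len_right = lengths.get(right, 1)
        if min(len_left, len_right) < alpha * (len_left + len_right):
            return False
    return True

# Small grammar deriving "ababa": X3 -> X2 a, X2 -> X1 X1, X1 -> a b
example = {"X1": ("a", "b"), "X2": ("X1", "X1"), "X3": ("X2", "a")}
balanced_at_fifth = is_alpha_balanced(example, Fraction(1, 5))
balanced_at_quarter = is_alpha_balanced(example, Fraction(1, 4))
```

In this example the rule X3 → X2 a splits a string of length 5 into parts of length 4 and 1, so the grammar passes the check for \(\alpha = 1/5\) but fails it for \(\alpha = 1/4\).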

Notes

Acknowledgments

This work was supported by JST CREST (Grant Number JPMJCR1402), and KAKENHI (Grant Numbers 18K18111, 17H01791 and 16K16009).

References

  1. Badkobeh, G., Gagie, T., Inenaga, S., Kociumaka, T., Kosolobov, D., Puglisi, S.J.: On two LZ78-style grammars: compression bounds and compressed-space computation. In: Fici, G., Sciortino, M., Venturini, R. (eds.) SPIRE 2017. LNCS, vol. 10508, pp. 51–67. Springer, Cham (2017). https://doi.org/10.1007/978-3-319-67428-5_5
  2. Charikar, M., Lehman, E., Liu, D., Panigrahy, R., Prabhakaran, M., Sahai, A., Shelat, A.: The smallest grammar problem. IEEE Trans. Inf. Theory 51(7), 2554–2576 (2005)
  3. Gagie, T., Gawrychowski, P., Kärkkäinen, J., Nekrich, Y., Puglisi, S.J.: A faster grammar-based self-index. In: Dediu, A.-H., Martín-Vide, C. (eds.) LATA 2012. LNCS, vol. 7183, pp. 240–251. Springer, Heidelberg (2012). https://doi.org/10.1007/978-3-642-28332-1_21
  4. Goto, K., Bannai, H., Inenaga, S., Takeda, M.: LZD Factorization: simple and practical online grammar compression with variable-to-fixed encoding. In: Cicalese, F., Porat, E., Vaccaro, U. (eds.) CPM 2015. LNCS, vol. 9133, pp. 219–230. Springer, Cham (2015). https://doi.org/10.1007/978-3-319-19929-0_19
  5. Hucke, D., Lohrey, M., Reh, C.P.: The smallest grammar problem revisited. In: Inenaga, S., Sadakane, K., Sakai, T. (eds.) SPIRE 2016. LNCS, vol. 9954, pp. 35–49. Springer, Cham (2016). https://doi.org/10.1007/978-3-319-46049-9_4
  6. Jez, A.: Approximation of grammar-based compression via recompression. Theor. Comput. Sci. 592, 115–134 (2015)
  7. Jez, A.: A really simple approximation of smallest grammar. Theor. Comput. Sci. 616, 141–150 (2016)
  8. Larsson, N.J., Moffat, A.: Offline dictionary-based compression. In: Data Compression Conference, DCC 1999, pp. 296–305 (1999)
  9. Lohrey, M.: Algorithmics on SLP-compressed strings: a survey. Groups Complex. Cryptol. 4(2), 241–299 (2012)
  10. Miller, V.S., Wegman, M.N.: Variations on a theme by Ziv and Lempel. In: Apostolico, A., Galil, Z. (eds.) Combinatorial Algorithms on Words. NATO ASI Series, vol. 12, pp. 131–140. Springer, Heidelberg (1985)
  11. Nelson, G., Kieffer, J., Cosman, P.: An interesting hierarchical lossless data compression algorithm. In: IEEE Information Theory Society Workshop (1995)
  12. Nevill-Manning, C.G., Witten, I.H.: Identifying hierarchical structure in sequences: a linear-time algorithm. J. Artif. Intell. Res. (JAIR) 7, 67–82 (1997)
  13. Rytter, W.: Application of Lempel-Ziv factorization to the approximation of grammar-based compression. Theor. Comput. Sci. 302(1–3), 211–222 (2003)
  14. Sakamoto, H.: A fully linear-time approximation algorithm for grammar-based compression. J. Discrete Algorithms 3(2–4), 416–430 (2005)
  15. Storer, J.A., Szymanski, T.G.: The macro model for data compression (extended abstract). In: Proceedings of the 10th Annual ACM Symposium on Theory of Computing, pp. 30–39 (1978)
  16. Takabatake, Y., I, T., Sakamoto, H.: A space-optimal grammar compression. In: Proceedings of ESA 2017, pp. 67:1–67:15 (2017)
  17. Ziv, J., Lempel, A.: Compression of individual sequences via variable-length coding. IEEE Trans. Inf. Theory 24(5), 530–536 (1978)

Copyright information

© Springer International Publishing AG, part of Springer Nature 2018

Authors and Affiliations

  • Tatsuya Ohno (1)
  • Keisuke Goto (2)
  • Yoshimasa Takabatake (1)
  • Tomohiro I (1)
  • Hiroshi Sakamoto (1)

  1. Kyushu Institute of Technology, Kitakyushu, Japan
  2. Fujitsu Laboratories Ltd., Kawasaki, Japan
