On Two LZ78-style Grammars: Compression Bounds and Compressed-Space Computation

  • Golnaz Badkobeh
  • Travis Gagie
  • Shunsuke Inenaga
  • Tomasz Kociumaka
  • Dmitry Kosolobov
  • Simon J. Puglisi
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 10508)

Abstract

We investigate two closely related LZ78-based compression schemes: LZMW (an old scheme by Miller and Wegman) and LZD (a recent variant by Goto et al.). Both LZD and LZMW naturally produce a grammar for a string of length n; we show that the size of this grammar can be larger than the size of the smallest grammar by a factor \(\varOmega (n^{\frac{1}{3}})\) but is always within a factor \(O((\frac{n}{\log n})^{\frac{2}{3}})\) of it. In addition, we show that the standard algorithms that construct the LZD and LZMW parsings in \(\varTheta (z)\) working space, where z is the size of the parsing, take \(\varOmega (n^{\frac{5}{4}})\) time in the worst case. We then describe a new Las Vegas LZD/LZMW parsing algorithm that uses \(O (z \log n)\) space and \(O(n + z \log ^2 n)\) time with high probability.
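The abstract does not spell out the LZD parsing itself. As a rough illustration only, assuming the usual definition attributed to Goto et al. (each phrase is the concatenation of two pieces, where each piece is the longest prefix of the remaining text that is either an earlier phrase or a single letter, and the second piece may be empty at the end of the string), a naive sketch might look as follows. The function name `lzd_parse` and the brute-force dictionary lookup are hypothetical choices made for this example; this is neither the standard \(\varTheta (z)\)-space algorithm nor the Las Vegas algorithm described in the paper.

```python
def lzd_parse(s):
    """Naive LZD-style parsing sketch (illustrative, not the paper's algorithm).

    Returns a list of (left, right) pairs; the i-th phrase is left + right.
    """
    n = len(s)
    seen = set()      # all phrases produced so far, stored as plain strings
    max_len = 0       # length of the longest phrase produced so far
    phrases = []

    def longest(pos):
        # Length of the longest earlier phrase that is a prefix of s[pos:];
        # falls back to a single letter, or to 0 at the end of the string.
        for length in range(min(max_len, n - pos), 1, -1):
            if s[pos:pos + length] in seen:
                return length
        return 1 if pos < n else 0

    i = 0
    while i < n:
        l1 = longest(i)            # first piece: earlier phrase or letter
        l2 = longest(i + l1)       # second piece: may be empty at the end
        left, right = s[i:i + l1], s[i + l1:i + l1 + l2]
        phrases.append((left, right))
        seen.add(left + right)
        max_len = max(max_len, l1 + l2)
        i += l1 + l2
    return phrases
```

For instance, under these assumptions `lzd_parse("abab")` yields `[("a", "b"), ("ab", "")]`. Each pair corresponds to one rule of the resulting grammar, which is the object whose size the abstract compares against the smallest grammar. The substring lookups make this sketch slow on long inputs; it is meant only to make the definition concrete.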

Keywords

LZMW, LZD, LZ78, Compression, Smallest grammar

References

  1. Supplementary materials for the present paper: C++ code for described experiments. https://bitbucket.org/dkosolobov/lzd-lzmw
  2. Belazzougui, D., Boldi, P., Vigna, S.: Dynamic Z-Fast tries. In: Chavez, E., Lonardi, S. (eds.) SPIRE 2010. LNCS, vol. 6393, pp. 159–172. Springer, Heidelberg (2010). doi:10.1007/978-3-642-16321-0_15
  3. Belazzougui, D., Cording, P.H., Puglisi, S.J., Tabei, Y.: Access, rank, and select in grammar-compressed strings. In: Bansal, N., Finocchi, I. (eds.) ESA 2015. LNCS, vol. 9294, pp. 142–154. Springer, Heidelberg (2015). doi:10.1007/978-3-662-48350-3_13
  4. Bille, P., Landau, G.M., Raman, R., Sadakane, K., Satti, S.R., Weimann, O.: Random access to grammar-compressed strings and trees. SIAM J. Comput. 44(3), 513–539 (2015)
  5. Charikar, M., Lehman, E., Liu, D., Panigrahy, R., Prabhakaran, M., Sahai, A., Shelat, A.: The smallest grammar problem. IEEE Trans. Inf. Theor. 51(7), 2554–2576 (2005)
  6. Claude, F., Navarro, G.: Self-indexed grammar-based compression. Fundamenta Informaticae 111(3), 313–337 (2011)
  7. Gagie, T., Gawrychowski, P., Kärkkäinen, J., Nekrich, Y., Puglisi, S.J.: A faster grammar-based self-index. In: Dediu, A.-H., Martín-Vide, C. (eds.) LATA 2012. LNCS, vol. 7183, pp. 240–251. Springer, Heidelberg (2012). doi:10.1007/978-3-642-28332-1_21
  8. Goto, K., Bannai, H., Inenaga, S., Takeda, M.: LZD Factorization: simple and practical online grammar compression with variable-to-fixed encoding. In: Cicalese, F., Porat, E., Vaccaro, U. (eds.) CPM 2015. LNCS, vol. 9133, pp. 219–230. Springer, Cham (2015). doi:10.1007/978-3-319-19929-0_19
  9. Hucke, D., Lohrey, M., Reh, C.P.: The smallest grammar problem revisited. In: Inenaga, S., Sadakane, K., Sakai, T. (eds.) SPIRE 2016. LNCS, vol. 9954, pp. 35–49. Springer, Cham (2016). doi:10.1007/978-3-319-46049-9_4
  10. I, T., Nakashima, Y., Inenaga, S., Bannai, H., Takeda, M.: Efficient Lyndon factorization of grammar compressed text. In: Fischer, J., Sanders, P. (eds.) CPM 2013. LNCS, vol. 7922, pp. 153–164. Springer, Heidelberg (2013). doi:10.1007/978-3-642-38905-4_16
  11. Karp, R.M., Rabin, M.O.: Efficient randomized pattern-matching algorithms. IBM J. Res. Devel. 31(2), 249–260 (1987)
  12. Kempa, D., Kosolobov, D.: LZ-End parsing in compressed space. In: Proceedings of Data Compression Conference (DCC), pp. 350–359. IEEE (2017)
  13. Kreft, S., Navarro, G.: On compressing and indexing repetitive sequences. Theoret. Comput. Sci. 483, 115–133 (2013)
  14. Miller, V.S., Wegman, M.N.: Variations on a theme by Ziv and Lempel. In: Apostolico, A., Galil, Z. (eds.) Proceedings of NATO Advanced Research Workshop on Combinatorial Algorithms on Words, NATO ASI, vol. 12, pp. 131–140. Springer, Heidelberg (1985)
  15. Rytter, W.: Application of Lempel-Ziv factorization to the approximation of grammar-based compression. Theoret. Comput. Sci. 302(1–3), 211–222 (2003)
  16. Tanaka, T., I, T., Inenaga, S., Bannai, H., Takeda, M.: Computing convolution on grammar-compressed text. In: Proceedings of Data Compression Conference (DCC), pp. 451–460. IEEE (2013)
  17. Westbrook, J.: Fast incremental planarity testing. In: Kuich, W. (ed.) ICALP 1992. LNCS, vol. 623, pp. 342–353. Springer, Heidelberg (1992). doi:10.1007/3-540-55719-9_86
  18. Ziv, J., Lempel, A.: Compression of individual sequences via variable-rate coding. IEEE Trans. Inf. Theor. 24(5), 530–536 (1978)
  19. Ziv, J., Lempel, A.: A universal algorithm for sequential data compression. IEEE Trans. Inf. Theor. 23(3), 337–343 (1977)

Copyright information

© Springer International Publishing AG 2017

Authors and Affiliations

  • Golnaz Badkobeh (1)
  • Travis Gagie (2)
  • Shunsuke Inenaga (3)
  • Tomasz Kociumaka (4)
  • Dmitry Kosolobov (5)
  • Simon J. Puglisi (5)

  1. Department of Computer Science, University of Warwick, Coventry, England
  2. CeBiB, EIT, Diego Portales University, Santiago, Chile
  3. Department of Informatics, Kyushu University, Fukuoka, Japan
  4. Institute of Informatics, University of Warsaw, Warsaw, Poland
  5. Department of Computer Science, University of Helsinki, Helsinki, Finland
