The Smallest Grammar Problem Revisited

Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 9954)


In a seminal paper of Charikar et al. on the smallest grammar problem, the authors derive upper and lower bounds on the approximation ratios for several grammar-based compressors, but in all cases there is a gap between the lower and upper bound. Here we close the gaps for LZ78 and BISECTION by showing that the approximation ratio of LZ78 is \(\varTheta ( (n/\log n)^{2/3})\), whereas the approximation ratio of BISECTION is \(\varTheta ( (n/\log n)^{1/2})\). We also derive a lower bound for a smallest grammar for a word in terms of its number of LZ77-factors, which refines existing bounds of Rytter. Finally, we improve results of Arpe and Reischuk relating grammar-based compression for arbitrary alphabets and binary alphabets.
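For readers unfamiliar with the two compressors named above, the following is a minimal sketch of how they are usually described, not the authors' own formulation. It assumes that LZ78 cuts the word into factors that are each the shortest not-yet-seen prefix of the remaining text, and that BISECTION splits a word at the largest power of two strictly below its length and introduces one rule per distinct substring of length greater than one. The function names `lz78_factors` and `bisection_rules` are hypothetical.

```python
def lz78_factors(w):
    """Split w into LZ78 factors (assumed definition): each factor is the
    shortest prefix of the remaining text that is not an earlier factor."""
    seen = set()
    factors = []
    current = ""
    for ch in w:
        current += ch
        if current not in seen:
            seen.add(current)
            factors.append(current)
            current = ""
    if current:
        # The final factor may coincide with an earlier one.
        factors.append(current)
    return factors


def bisection_rules(w):
    """Map each distinct substring of length > 1 produced by BISECTION
    (assumed definition) to its (left, right) split."""
    rules = {}

    def split(s):
        if len(s) <= 1 or s in rules:
            return
        k = 1
        while 2 * k < len(s):  # largest power of two strictly below len(s)
            k *= 2
        left, right = s[:k], s[k:]
        rules[s] = (left, right)
        split(left)
        split(right)

    split(w)
    return rules


if __name__ == "__main__":
    word = "abababab"
    print(lz78_factors(word))
    # Every rule has a right-hand side of length 2, so under this sketch
    # the grammar size is twice the number of rules.
    print(2 * len(bisection_rules(word)))
```

Under these assumptions, the number of LZ78 factors and the number of BISECTION rules are the quantities that get compared against the size of a smallest grammar when an approximation ratio is measured.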


  1. Arpe, J., Reischuk, R.: On the complexity of optimal grammar-based compression. In: Proceedings of the DCC 2006, pp. 173–182. IEEE Computer Society (2006)
  2. Berstel, J., Brlek, S.: On the length of word chains. Inf. Process. Lett. 26(1), 23–28 (1987)
  3. Casel, K., Fernau, H., Gaspers, S., Gras, B., Schmid, M.L.: On the complexity of grammar-based compression over fixed alphabets. In: Proceedings of ICALP 2016, LNCS. Springer, Heidelberg (2016, to appear)
  4. Charikar, M., Lehman, E., Lehman, A., Liu, D., Panigrahy, R., Prabhakaran, M., Sahai, A., Shelat, A.: The smallest grammar problem. IEEE Trans. Inf. Theory 51(7), 2554–2576 (2005)
  5. Diwan, A.A.: A New Combinatorial Complexity Measure for Languages. Tata Institute, Bombay (1986)
  6. Gasieniec, L., Karpinski, M., Plandowski, W., Rytter, W.: Efficient algorithms for Lempel-Ziv encoding (extended abstract). In: Karlsson, R., Lingas, A. (eds.) SWAT 1996. LNCS, vol. 1097, pp. 392–403. Springer, Heidelberg (1996)
  7. Jeż, A.: Approximation of grammar-based compression via recompression. In: Fischer, J., Sanders, P. (eds.) CPM 2013. LNCS, vol. 7922, pp. 165–176. Springer, Heidelberg (2013)
  8. Kieffer, J.C., Yang, E.-H.: Grammar-based codes: a new class of universal lossless source codes. IEEE Trans. Inf. Theory 46(3), 737–754 (2000)
  9. Kieffer, J.C., Yang, E.-H., Nelson, G.J., Cosman, P.C.: Universal lossless compression via multilevel pattern matching. IEEE Trans. Inf. Theory 46(4), 1227–1245 (2000)
  10. Larsson, N.J., Moffat, A.: Offline dictionary-based compression. In: Proceedings of the DCC 1999, pp. 296–305. IEEE Computer Society (1999)
  11. Li, M., Vitányi, P.: An Introduction to Kolmogorov Complexity and Its Applications, 3rd edn. Springer, Heidelberg (2008)
  12. Lohrey, M.: The Compressed Word Problem for Groups. Springer, Heidelberg (2014)
  13. Nevill-Manning, C.G., Witten, I.H.: Identifying hierarchical structure in sequences: a linear-time algorithm. J. Artif. Intell. Res. 7, 67–82 (1997)
  14. Rytter, W.: Application of Lempel-Ziv factorization to the approximation of grammar-based compression. Theor. Comput. Sci. 302(1–3), 211–222 (2003)
  15. Storer, J.A., Szymanski, T.G.: Data compression via textual substitution. J. ACM 29(4), 928–951 (1982)
  16. Tabei, Y., Takabatake, Y., Sakamoto, H.: A succinct grammar compression. In: Fischer, J., Sanders, P. (eds.) CPM 2013. LNCS, vol. 7922, pp. 235–246. Springer, Heidelberg (2013)
  17. Ziv, J., Lempel, A.: Compression of individual sequences via variable-rate coding. IEEE Trans. Inf. Theory 24(5), 530–536 (1978)

Copyright information

© Springer International Publishing AG 2016

Authors and Affiliations

  1. University of Siegen, Siegen, Germany
