Tree Compression Using String Grammars

Algorithmica

Abstract

We study the compressed representation of a ranked tree by a (string) straight-line program (SLP) for its preorder traversal, and compare it with the well-studied representation by straight-line context-free tree grammars (also known as tree straight-line programs, or TSLPs). Although SLPs may be exponentially more succinct than TSLPs, we show that many simple tree queries can still be performed efficiently on SLPs, such as computing the height of a tree, tree navigation, or evaluation of Boolean expressions. Other problems on tree traversals turn out to be intractable, e.g. pattern matching and evaluation of tree automata. These problems can still be solved in polynomial time for TSLPs.
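To make the representation concrete, here is a minimal sketch of an SLP whose derived string is the preorder traversal of a ranked tree. It is an illustration under stated assumptions, not the construction or the algorithms of the paper: the alphabet `RANK`, the grammar `RULES`, and all function names are hypothetical. The validity check relies on the standard fact that a word is the preorder traversal of a ranked tree exactly when the weights rank(c) - 1 sum to -1 and every proper prefix has a nonnegative sum, which is equivalent to the total and the maximal nonempty suffix sum both being -1.

```python
from functools import lru_cache

# Hypothetical ranked alphabet: symbol -> rank (number of children).
RANK = {"f": 2, "a": 0}

# Hypothetical SLP: a nonterminal maps to a terminal symbol (str) or to a
# pair of nonterminals (tuple). F3 derives f^8, A3 derives a^8, and the
# start symbol S derives f^8 a^9, the preorder traversal of a comb-shaped tree.
RULES = {
    "F0": "f", "A0": "a",
    "F1": ("F0", "F0"), "F2": ("F1", "F1"), "F3": ("F2", "F2"),
    "A1": ("A0", "A0"), "A2": ("A1", "A1"), "A3": ("A2", "A2"),
    "T": ("A3", "A0"),
    "S": ("F3", "T"),
}

@lru_cache(maxsize=None)
def length(x):
    """Length of the string derived from x, computed without expanding it."""
    rhs = RULES[x]
    if isinstance(rhs, str):
        return 1
    return length(rhs[0]) + length(rhs[1])

def symbol_at(x, i):
    """Symbol at 0-based position i of the string derived from x."""
    while True:
        rhs = RULES[x]
        if isinstance(rhs, str):
            return rhs
        left, right = rhs
        if i < length(left):
            x = left
        else:
            i -= length(left)
            x = right

@lru_cache(maxsize=None)
def profile(x):
    """Pair (total, max nonempty suffix sum) of the weights rank(c) - 1."""
    rhs = RULES[x]
    if isinstance(rhs, str):
        w = RANK[rhs] - 1
        return (w, w)
    (tl, sl), (tr, sr) = profile(rhs[0]), profile(rhs[1])
    # A suffix of the concatenation lies inside the right part, or is a
    # suffix of the left part followed by the whole right part.
    return (tl + tr, max(sr, sl + tr))

def is_preorder_traversal(x):
    """True if x derives the preorder traversal of a single ranked tree."""
    total, max_suffix = profile(x)
    return total == -1 and max_suffix == -1

def expand(x):
    """Fully decompress x (exponential in general; for small examples only)."""
    rhs = RULES[x]
    return rhs if isinstance(rhs, str) else expand(rhs[0]) + expand(rhs[1])

print(length("S"), symbol_at("S", 8), is_preorder_traversal("S"))  # 17 a True
```

Each of `length` and `profile` visits every nonterminal once, so both run in time linear in the size of the grammar even when the derived traversal is exponentially longer.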

Notes

  1. Actually, in [5] the result is stated not for the convolution \(u_n \otimes v_n\), but for the literal shuffle of \(u_n\) and \(v_n\), which is \(u_n[1] v_n[1] u_n[2] v_n[2] \cdots u_n[m] v_n[m]\). This makes no difference, since the sizes of the smallest SLPs for the convolution and for the literal shuffle of two words differ only by a constant multiplicative factor.

  2. Let us explain the differences from [3, proof of Thm. 4.1]: In [3], the arithmetic expression is given by a circuit instead of an SLP. This simplifies the proof, because if we replace the SLP \({\mathbb {A}}\) in the above language L by a circuit, then we can decide L in polynomial time (we only have to evaluate a circuit modulo a prime number with polynomially many bits). In our situation, we can only show that L belongs to a certain level of the counting hierarchy. This suffices to prove the theorem; only the level in the counting hierarchy increases by the number of levels at which the set L sits.
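The two word operations compared in note 1 can be sketched as follows. This is only an illustration on uncompressed words, assuming both words have the same length m, as the formula in the note suggests; the function names are hypothetical, and the convolution is modelled as a word over an alphabet of symbol pairs.

```python
def literal_shuffle(u, v):
    """Interleave two equal-length words: u[1] v[1] u[2] v[2] ... u[m] v[m]."""
    if len(u) != len(v):
        raise ValueError("words must have the same length")
    return "".join(a + b for a, b in zip(u, v))

def convolution(u, v):
    """Pair up two equal-length words position by position."""
    if len(u) != len(v):
        raise ValueError("words must have the same length")
    return list(zip(u, v))

print(literal_shuffle("abc", "xyz"))  # axbycz
print(convolution("ab", "xy"))        # [('a', 'x'), ('b', 'y')]
```

Each pair of the convolution corresponds to two consecutive symbols of the literal shuffle, so either word can be read off from the other symbol by symbol.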

References

  1. Akutsu, T.: A bisection algorithm for grammar-based compression of ordered trees. Inf. Process. Lett. 110(18–19), 815–820 (2010)

  2. Allender, E., Balaji, N., Datta, S.: Low-depth uniform threshold circuits and the bit-complexity of straight line programs. In: Proceedings of MFCS 2014, Part II, volume 8635 of Lecture Notes in Computer Science, pp. 13–24. Springer (2014)

  3. Allender, E., Bürgisser, P., Kjeldgaard-Pedersen, J., Bro Miltersen, P.: On the complexity of numerical analysis. SIAM J. Comput. 38(5), 1987–2006 (2009)

  4. Allender, E., Wagner, K.W.: Counting hierarchies: polynomial time and constant depth circuits. Bull. EATCS 40, 182–194 (1990)

  5. Bertoni, A., Choffrut, C., Radicioni, R.: Literal shuffle of compressed words. In: Proceedings of IFIP TCS 2008, volume 273 of IFIP, pp. 87–100. Springer (2008)

  6. Benoit, D., Demaine, E.D., Munro, J.I., Raman, R., Raman, V., Rao, S.S.: Representing trees of higher degree. Algorithmica 43(4), 275–292 (2005)

  7. Bille, P., Gørtz, I.L., Landau, G.M., Weimann, O.: Tree compression with top trees. Inf. Comput. 243, 166–177 (2015)

  8. Bille, P., Landau, G.M., Raman, R., Sadakane, K., Satti, S.R., Weimann, O.: Random access to grammar-compressed strings and trees. SIAM J. Comput. 44(3), 513–539 (2015)

  9. Bousquet-Mélou, M., Lohrey, M., Maneth, S., Noeth, E.: XML compression via DAGs. Theory Comput. Syst. 57, 1322 (2014)

  10. Busatto, G., Lohrey, M., Maneth, S.: Efficient memory representation of XML document trees. Inf. Syst. 33(4–5), 456–474 (2008)

  11. Buss, S.R.: The Boolean formula value problem is in ALOGTIME. In: Proceedings of STOC 1987, pp. 123–131. ACM Press (1987)

  12. Caussinus, H., McKenzie, P., Thérien, D., Vollmer, H.: Nondeterministic \(NC^1\) computation. J. Comput. Syst. Sci. 57(2), 200–212 (1998)

  13. Charikar, M., Lehman, E., Lehman, A., Liu, D., Panigrahy, R., Prabhakaran, M., Sahai, A., Shelat, A.: The smallest grammar problem. IEEE Trans. Inf. Theory 51(7), 2554–2576 (2005)

  14. Comon, H., Dauchet, M., Gilleron, R., Jacquemard, F., Lugiez, D., Löding, C., Tison, S., Tommasi, M.: Tree automata techniques and applications. http://tata.gforge.inria.fr/

  15. Esparza, J., Luttenberger, M., Schlund, M.: A brief history of Strahler numbers. In: Proceedings of LATA 2014, volume 8370 of Lecture Notes in Computer Science, pp. 1–13. Springer (2014)

  16. Ferragina, P., Luccio, F., Manzini, G., Muthukrishnan, S.: Compressing and indexing labeled trees with applications. J. ACM 57(1), 4 (2009)

  17. Flajolet, P., Raoult, J.-C., Vuillemin, J.: The number of registers required for evaluating arithmetic expressions. Theor. Comput. Sci. 9, 99–125 (1979)

  18. Ganardi, M., Hucke, D., Jeż, A., Lohrey, M., Noeth, E.: Constructing small tree grammars and small circuits for formulas. J. Comput. Syst. Sci. (2017). doi:10.1016/j.jcss.2016.12.007

  19. Ganardi, M., Hucke, D., Lohrey, M., Noeth, E.: Tree compression using string grammars. In: Proceedings of LATIN 2016, volume 9644 of Lecture Notes in Computer Science, pp. 590–604. Springer (2016)

  20. Hagenah, C.: Gleichungen mit regulären Randbedingungen über freien Gruppen. PhD thesis, University of Stuttgart, Institut für Informatik (2000)

  21. Hesse, W., Allender, E., Mix Barrington, D.A.: Uniform constant-depth threshold circuits for division and iterated multiplication. J. Comput. Syst. Sci. 65, 695–716 (2002)

  22. Hübschle-Schneider, L., Raman, R.: Tree compression with top trees revisited. In: Proceedings of SEA 2015, volume 9125 of Lecture Notes in Computer Science, pp. 15–27. Springer (2015)

  23. Hucke, D., Lohrey, M., Noeth, E.: Constructing small tree grammars and small circuits for formulas. In: Proceedings of FSTTCS 2014, volume 29 of LIPIcs, pp. 457–468. Schloss Dagstuhl - Leibniz-Zentrum für Informatik (2014)

  24. Jacobson, G.: Space-efficient static trees and graphs. In: Proceedings of FOCS 1989, pp. 549–554. IEEE Computer Society (1989)

  25. Jansson, J., Sadakane, K., Sung, W.-K.: Ultra-succinct representation of ordered trees with applications. J. Comput. Syst. Sci. 78(2), 619–631 (2012)

  26. Jeż, A.: Approximation of grammar-based compression via recompression. Theor. Comput. Sci. 592, 115–134 (2015)

  27. Jeż, A.: Faster fully compressed pattern matching by recompression. ACM Trans. Algorithms 11(3), 20:1–20:43 (2015)

  28. Jeż, A.: A really simple approximation of smallest grammar. Theor. Comput. Sci. 616, 141–150 (2016)

  29. Jeż, A., Lohrey, M.: Approximation of smallest linear tree grammars. In: Proceedings of STACS 2014, volume 25 of LIPIcs, pp. 445–457. Schloss Dagstuhl - Leibniz-Zentrum für Informatik (2014)

  30. Kobayashi, N., Matsuda, K., Shinohara, A.: Functional programs as compressed data. In: Proceedings of PEPM 2012, pp. 121–130. ACM Press (2012)

  31. Lohrey, M.: On the parallel complexity of tree automata. In: Proceedings of RTA 2001, volume 2051 of Lecture Notes in Computer Science, pp. 201–215. Springer (2001)

  32. Lohrey, M.: Leaf languages and string compression. Inf. Comput. 209(6), 951–965 (2011)

  33. Lohrey, M.: The Compressed Word Problem for Groups. Springer, Berlin (2014)

  34. Lohrey, M.: Grammar-based tree compression. In: Proceedings of DLT 2015, volume 9168 of Lecture Notes in Computer Science, pp. 46–57. Springer (2015)

  35. Lohrey, M., Maneth, S.: The complexity of tree automata and XPath on grammar-compressed trees. Theor. Comput. Sci. 363(2), 196–210 (2006)

  36. Lohrey, M., Maneth, S., Mennicke, R.: XML tree structure compression using RePair. Inf. Syst. 38(8), 1150–1167 (2013)

  37. Lohrey, M., Maneth, S., Schmidt-Schauß, M.: Parameter reduction and automata evaluation for grammar-compressed trees. J. Comput. Syst. Sci. 78(5), 1651–1669 (2012)

  38. Lohrey, M., Mathissen, C.: Isomorphism of regular trees and words. Inf. Comput. 224, 71–105 (2013)

  39. Munro, J.I., Raman, V.: Succinct representation of balanced parentheses and static trees. SIAM J. Comput. 31(3), 762–776 (2001)

  40. Navarro, G., Ordóñez Pereira, A.: Faster compressed suffix trees for repetitive text collections. In: Proceedings of SEA 2014, volume 8504 of Lecture Notes in Computer Science, pp. 424–435. Springer (2014)

  41. Rytter, W.: Application of Lempel-Ziv factorization to the approximation of grammar-based compression. Theor. Comput. Sci. 302(1–3), 211–222 (2003)

  42. Sakamoto, H.: A fully linear-time approximation algorithm for grammar-based compression. J. Discrete Algorithms 3(2–4), 416–430 (2005)

  43. Schmidt-Schauß, M.: Linear compressed pattern matching for polynomial rewriting (extended abstract). In: Proceedings of TERMGRAPH 2013, volume 110 of EPTCS, pp. 29–40 (2013)

  44. Toda, S.: PP is as hard as the polynomial-time hierarchy. SIAM J. Comput. 20(5), 865–877 (1991)

  45. Vollmer, H.: Introduction to Circuit Complexity. Springer, Berlin (1999)

Acknowledgements

Funding was provided by Deutsche Forschungsgemeinschaft (Grant No. LO 748/10-1).

Author information

Corresponding author

Correspondence to Moses Ganardi.

Additional information

Markus Lohrey and Eric Noeth are supported by the DFG-project LO 748/10-1 (QUANT-KOMP).

About this article

Cite this article

Ganardi, M., Hucke, D., Lohrey, M. et al. Tree Compression Using String Grammars. Algorithmica 80, 885–917 (2018). https://doi.org/10.1007/s00453-017-0279-3

  • DOI: https://doi.org/10.1007/s00453-017-0279-3
