Minimizing Tree Automata for Unranked Trees

  • Wim Martens
  • Joachim Niehren
Part of the Lecture Notes in Computer Science book series (LNCS, volume 3774)


Automata for unranked trees form a foundation for XML schemas, querying, and pattern languages. We study the problem of efficiently minimizing such automata. We start with the unranked tree automata (UTAs) that are standard in database theory, assuming bottom-up determinism and that horizontal recursion is represented by deterministic finite automata. We show that minimal UTAs in that class are not unique and that minimization is NP-hard. We then study more recent automata classes that do allow for polynomial-time minimization. Among those, we show that bottom-up deterministic stepwise tree automata yield the most succinct representations.
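
To make the stepwise model mentioned above more concrete, the following is a minimal sketch of how a bottom-up deterministic stepwise tree automaton might evaluate an unranked tree. It follows the common informal description of such automata (a node starts in a state determined by its label, then absorbs the states of its children one at a time from left to right) and is not taken from the paper's formal definitions. The class name `StepwiseAutomaton`, the fields `init`, `step`, and `final`, and the example language are all assumptions made for illustration.

```python
from typing import Dict, List, Optional, Set, Tuple

# An unranked tree is a label plus an arbitrarily long list of subtrees.
Tree = Tuple[str, List["Tree"]]


class StepwiseAutomaton:
    """Hypothetical container for a deterministic stepwise tree automaton."""

    def __init__(
        self,
        init: Dict[str, str],               # label -> initial state
        step: Dict[Tuple[str, str], str],   # (state, child state) -> state
        final: Set[str],                    # accepting states
    ) -> None:
        self.init = init
        self.step = step
        self.final = final

    def run(self, tree: Tree) -> Optional[str]:
        """Return the state reached on `tree`, or None if the run is stuck."""
        label, children = tree
        state = self.init.get(label)
        for child in children:
            if state is None:
                return None
            child_state = self.run(child)
            if child_state is None:
                return None
            # Absorb one child at a time, left to right.
            state = self.step.get((state, child_state))
        return state

    def accepts(self, tree: Tree) -> bool:
        return self.run(tree) in self.final


# Example (an assumption, not from the paper): accept exactly the trees whose
# root is labeled "a" and has an even number of children, all of them "b" leaves.
even_b = StepwiseAutomaton(
    init={"a": "even", "b": "leaf"},
    step={("even", "leaf"): "odd", ("odd", "leaf"): "even"},
    final={"even"},
)

b: Tree = ("b", [])
print(even_b.accepts(("a", [b, b])))  # two b-leaves under a
print(even_b.accepts(("a", [b])))     # one b-leaf under a
```

Because each node is processed one child at a time with a single transition table, the same states serve for both the vertical and the horizontal direction, which is one way to see why this representation can be compact.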


Keywords: Binary Tree, Regular Language, Tree Automaton, Tree Language, Pattern Language





Copyright information

© Springer-Verlag Berlin Heidelberg 2005

Authors and Affiliations

  • Wim Martens (1)
  • Joachim Niehren (2)
  1. Hasselt University and Transnational University of Limburg, Diepenbeek, Belgium
  2. INRIA Futurs, LIFL, Mostrare project, Lille, France
