Complexity of Decision Problems for Simple Regular Expressions

  • Wim Martens
  • Frank Neven
  • Thomas Schwentick
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 3153)


We study the complexity of the inclusion, equivalence, and intersection problems for simple regular expressions arising in practical XML schemas. These expressions are essentially concatenations of factors, where each factor is a disjunction of strings, possibly extended with ‘*’ or ‘?’. We obtain lower and upper bounds for various fragments of simple regular expressions. Although we show that inclusion and intersection are already intractable for very weak expressions, we also identify some tractable cases. For equivalence, we only prove an initial tractability result, leaving the complexity of more general cases open. The main motivation for this research comes from database theory, more specifically from XML and semi-structured data. In particular, we show that all lower and upper bounds for inclusion and equivalence carry over to the corresponding decision problems for extended context-free grammars and single-type tree grammars, which are abstractions of DTDs and XML Schemas, respectively. For intersection, we show that the complexity only carries over for DTDs.
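
To make the shape of these expressions concrete, the following is a minimal sketch, not taken from the paper, of how a simple regular expression could be represented and how inclusion could be probed by brute force. The names `Factor`, `to_pattern`, `words`, and `inclusion_counterexample` are hypothetical, and the check only enumerates words up to a chosen length, so it can refute inclusion but cannot prove it; it is not one of the decision procedures analysed in the paper.

```python
import re
from itertools import product

# Assumed encoding: a factor is (alternatives, modifier), i.e. a disjunction
# of strings optionally followed by '*' or '?'. An expression is a list of
# factors, read as their concatenation.
Factor = tuple[list[str], str]


def to_pattern(expr: list[Factor]) -> re.Pattern:
    """Translate a simple regular expression into a compiled Python regex."""
    parts = []
    for alternatives, modifier in expr:
        if modifier not in ("", "*", "?"):
            raise ValueError(f"unsupported modifier: {modifier!r}")
        body = "|".join(re.escape(s) for s in alternatives)
        parts.append(f"(?:{body}){modifier}")
    return re.compile("".join(parts))


def words(alphabet: str, max_len: int):
    """Yield all words over the alphabet, shortest first."""
    for n in range(max_len + 1):
        for letters in product(alphabet, repeat=n):
            yield "".join(letters)


def inclusion_counterexample(r1, r2, alphabet, max_len):
    """Return a word of length <= max_len in L(r1) but not in L(r2), or None.

    None only means that no counterexample exists up to max_len.
    """
    p1, p2 = to_pattern(r1), to_pattern(r2)
    for w in words(alphabet, max_len):
        if p1.fullmatch(w) and not p2.fullmatch(w):
            return w
    return None


# a(b+c)*  versus  a? b* c*
r1 = [(["a"], ""), (["b", "c"], "*")]
r2 = [(["a"], "?"), (["b"], "*"), (["c"], "*")]
print(repr(inclusion_counterexample(r1, r2, "abc", 4)))
```

Because the empty word is a legitimate counterexample, callers should compare the result against `None` rather than test its truthiness. The enumeration is exponential in `max_len`, which is acceptable for illustration only.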


Keywords (added by machine, not by the authors): Regular Expression, Intersection Problem, Truth Assignment, Inclusion Problem, Tree Automaton





Copyright information

© Springer-Verlag Berlin Heidelberg 2004

Authors and Affiliations

  • Wim Martens (1)
  • Frank Neven (1)
  • Thomas Schwentick (2)
  1. Limburgs Universitair Centrum, Universitaire Campus, Diepenbeek, Belgium
  2. Fachbereich 12, Mathematik und Informatik, Philipps Universität Marburg
