The VLDB Journal, Volume 22, Issue 3, pp 369–393

Optimal and efficient generalized twig pattern processing: a combination of preorder and postorder filterings


  • Radim Bača
    • Department of Computer Science, VŠB-Technical University of Ostrava
  • Michal Krátký
    • Department of Computer Science, VŠB-Technical University of Ostrava
  • Tok Wang Ling
    • Department of Computer Science, National University of Singapore
  • Jiaheng Lu
    • DEKE, MOE and School of Information, Renmin University of China
Regular Paper

DOI: 10.1007/s00778-012-0295-5

Cite this article as:
Bača, R., Krátký, M., Ling, T.W. et al. The VLDB Journal (2013) 22: 369. doi:10.1007/s00778-012-0295-5


Searching for occurrences of a twig pattern query (TPQ) in an XML document is a core task of all XML database query languages. The generalized twig pattern (GTP) extends the TPQ model with semantics for output nodes, optional nodes, and Boolean expressions, all of which are part of the XQuery language. Preorder filtering holistic algorithms such as TwigStack represent a significant class of TPQ processing approaches; for some query classes, their worst-case I/O complexity is linear in the sum of the input and output sizes. Another important class is represented by postorder filtering holistic algorithms such as Twig²Stack, which introduced an output enumeration time that is linear in the result size. In this article, we introduce a holistic algorithm called GTPStack, which is the first approach capable of processing a GTP with a linear worst-case I/O complexity with respect to the GTP result size. This is achieved by combining preorder and postorder filtering before nodes are stored in an intermediate storage. Another contribution of this article is a new perspective on holistic algorithm optimality: we show that optimality depends not only on the query class but also on the characteristics of the XML document. This view extends the general knowledge about the types of queries for which holistic algorithms are optimal. It also allows us to determine that GTPStack is optimal for any GTP when a specific XML document is considered. We present a comprehensive experimental study of the state-of-the-art holistic algorithms, showing under which conditions GTPStack outperforms the other holistic approaches.
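
For readers unfamiliar with twig patterns, the following Python sketch illustrates what an occurrence of a TPQ is. It is a deliberately naive recursive matcher, not GTPStack, TwigStack, or any other holistic algorithm, and it performs no preorder or postorder filtering. The `QNode` class, the `matches` and `find_occurrences` functions, and the sample document are hypothetical names and data introduced only for this illustration.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass, field


@dataclass
class QNode:
    """One node of a twig pattern (hypothetical helper type).

    children holds (axis, QNode) pairs, where axis is "child" (/)
    or "descendant" (//).
    """
    tag: str
    children: list = field(default_factory=list)


def matches(elem, q):
    """Return True if pattern q can be embedded with its root mapped to elem."""
    if elem.tag != q.tag:
        return False
    for axis, child_q in q.children:
        if axis == "child":
            candidates = list(elem)
        else:
            # elem.iter() yields elem itself first, so skip it.
            candidates = list(elem.iter())[1:]
        if not any(matches(c, child_q) for c in candidates):
            return False
    return True


def find_occurrences(root, q):
    """Return all elements, in document order, that match the pattern root."""
    return [e for e in root.iter() if matches(e, q)]


# Sample document: only the first <a> has both a child <b> and a descendant <c>.
doc = ET.fromstring(
    "<root>"
    "<a id='1'><b/><d><c/></d></a>"
    "<a id='2'><c/></a>"
    "</root>"
)

# Twig pattern corresponding to the XPath expression //a[b][.//c]
query = QNode("a", [("child", QNode("b")), ("descendant", QNode("c"))])

hits = find_occurrences(doc, query)
print([e.get("id") for e in hits])
```

This matcher may revisit the same subtree many times, which is the kind of redundant work that holistic algorithms are designed to avoid.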


XML · Query processing · Generalized twig pattern · Holistic algorithms

Copyright information

© Springer-Verlag 2012