Definition
XML employs an ordered, tree-structured model for representing data. XML query languages such as XQuery use twig queries to match relevant portions of data in an XML database. An XML index is a data structure used to efficiently look up all matches of a fragment of a twig query, where some nodes of the fragment may already have been mapped to specific nodes in the XML database.
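As an illustration of the definition above, the following is a minimal sketch of one simple kind of index, assuming the twig fragment is a plain root-to-node path of element labels. The function name `build_path_index` and the sample document are hypothetical and are not taken from the entry or from any of the cited systems.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict


def build_path_index(root):
    """Map each root-to-node label path to the elements it reaches.

    Keys are tuples of element tags; values list the matching
    elements in document order.
    """
    index = defaultdict(list)

    def visit(node, prefix):
        path = prefix + (node.tag,)
        index[path].append(node)
        for child in node:  # ElementTree yields children in document order
            visit(child, path)

    visit(root, ())
    return index


# Hypothetical sample document, used only for illustration.
doc = ET.fromstring(
    "<book><title>XML</title>"
    "<author><name>Ann</name></author>"
    "<author><name>Bob</name></author></book>"
)
index = build_path_index(doc)

# Look up all matches of the path fragment book/author/name.
names = [e.text for e in index[("book", "author", "name")]]
print(names)  # ['Ann', 'Bob']
```

A lookup then costs one dictionary access instead of a traversal of the document. Branching twig patterns and fragments whose nodes are already bound to database nodes need more structure than this sketch provides.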
Historical Background
XML path indexing is related to the problem of join indexing in relational database systems [15] and path indexing in object-oriented database systems (see, e.g., [1,9]). These index structures assume that the schema is homogeneous and known; these assumptions do not hold in general for XML data. The DataGuide [7] was the first path index designed specifically for XML data, where the schema may be heterogeneous and may not even be known.
Foundations
Notation
An XML document d is a rooted, ordered, node-labeled tree, where (i) each node corresponds to an XML...
Recommended Reading
1. Bertino E. and Kim W. Indexing techniques for queries on nested objects. IEEE Trans. Knowledge and Data Eng., 1(2):196–214, 1989.
2. Bruno N., Koudas N., and Srivastava D. Holistic twig joins: optimal XML pattern matching. In Proc. ACM SIGMOD Int. Conf. on Management of Data, pp. 310–321, 2002.
3. Chen Z., Gehrke J., Korn F., Koudas N., Shanmugasundaram J., and Srivastava D. Index structures for matching XML twigs using relational query processors. Data Knowl. Eng., 60(2):283–302, 2007.
4. Chung C.-W., Min J.-K., and Shim K. APEX: an adaptive path index for XML data. In Proc. ACM SIGMOD Int. Conf. on Management of Data, pp. 121–132, 2002.
5. Cohen E., Kaplan H., and Milo T. Labeling dynamic XML trees. In Proc. ACM SIGACT-SIGMOD Symp. on Principles of Database Systems, pp. 271–281, 2002.
6. Cooper B.F., Sample N., Franklin M.J., Hjaltason G.R., and Shadmon M. A fast index for semistructured data. In Proc. 27th Int. Conf. on Very Large Data Bases, pp. 341–350, 2001.
7. Goldman R. and Widom J. DataGuides: enabling query formulation and optimization in semistructured databases. In Proc. 23rd Int. Conf. on Very Large Data Bases, pp. 436–445, 1997.
8. Grust T. Accelerating XPath location steps. In Proc. ACM SIGMOD Int. Conf. on Management of Data, pp. 109–120, 2002.
9. Kemper A. and Moerkotte G. Access support in object bases. ACM SIGMOD Rec., 19(2):364–374, 1990.
10. Kha D.D., Yoshikawa M., and Uemura S. An XML indexing structure with relative region coordinate. In Proc. 17th Int. Conf. on Data Engineering, pp. 313–320, 2001.
11. McHugh J. and Widom J. Query optimization for XML. In Proc. 25th Int. Conf. on Very Large Data Bases, pp. 315–326, 1999.
12. Milo T. and Suciu D. Index structures for path expressions. In ICDT. Springer-Verlag, London, UK, 1999, pp. 277–295.
13. Rao P. and Moon B. PRIX: Indexing and querying XML using Prüfer sequences. In ICDE. IEEE Computer Society, WA, USA, 2004, pp. 288.
14. Tatarinov I., Viglas S., Beyer K., Shanmugasundaram J., Shekita E., and Zhang C. Storing and querying ordered XML using a relational database system. In Proc. ACM SIGMOD Int. Conf. on Management of Data, pp. 204–215, 2002.
15. Valduriez P. Join indices. ACM Trans. Database Syst., 12(2):218–246, 1987.
16. Wang H., Park S., Fan W., and Yu P. ViST: a dynamic index method for querying XML data by tree structures. In Proc. ACM SIGMOD Int. Conf. on Management of Data, pp. 110–121, 2003.
17. Yoshikawa M., Amagasa T., Shimura T., and Uemura S. XRel: a path-based approach to storage and retrieval of XML documents using relational databases. ACM Trans. Internet Tech., 1(1):110–141, 2001.
© 2009 Springer Science+Business Media, LLC
About this entry
Cite this entry
Dong, X., Srivastava, D. (2009). XML Indexing. In: LIU, L., ÖZSU, M.T. (eds) Encyclopedia of Database Systems. Springer, Boston, MA. https://doi.org/10.1007/978-0-387-39940-9_779
DOI: https://doi.org/10.1007/978-0-387-39940-9_779
Publisher Name: Springer, Boston, MA
Print ISBN: 978-0-387-35544-3
Online ISBN: 978-0-387-39940-9
eBook Packages: Computer Science; Reference Module Computer Science and Engineering