Reference Work Entry

Encyclopedia of Database Systems

pp 3585-3591

XML Indexing

  • Xin Luna Dong, AT&T Labs–Research
  • Divesh Srivastava, AT&T Labs–Research

Definition

XML employs an ordered, tree-structured model for representing data. Queries in XML languages like XQuery employ twig queries to match relevant portions of data in an XML database. An XML index is a data structure for efficiently looking up all matches of a fragment of a twig query, where some nodes of the fragment may already have been mapped to specific nodes in the XML database.
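
To make the definition concrete, the following is a minimal illustrative sketch in Python, not a structure described in this entry. It assumes the simplest possible case: the query fragment is a linear chain of child steps (no branching, no descendant axis), and the index maps each root-to-node label path to the nodes reached by it. The names build_path_index and lookup and the sample document are hypothetical.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict


def build_path_index(root):
    """Map each root-to-node label path (a tuple of tags) to the
    elements reached by that path, in document order."""
    index = defaultdict(list)

    def visit(node, path):
        path = path + (node.tag,)
        index[path].append(node)
        for child in node:
            visit(child, path)

    visit(root, ())
    return index


def lookup(index, fragment):
    """Return all elements whose label path ends with `fragment`,
    a sequence of tags connected by child steps (assumed linear)."""
    fragment = tuple(fragment)
    n = len(fragment)
    matches = []
    for path, nodes in index.items():
        # Compare the last n labels of the indexed path with the fragment.
        if path[-n:] == fragment:
            matches.extend(nodes)
    return matches


# Hypothetical sample data, for illustration only.
doc = ET.fromstring(
    "<bib>"
    "<book><title>A</title><author>X</author></book>"
    "<book><title>B</title><author>Y</author><author>Z</author></book>"
    "<article><author>W</author></article>"
    "</bib>"
)
index = build_path_index(doc)
book_authors = [e.text for e in lookup(index, ["book", "author"])]
```

Here lookup scans the distinct label paths rather than the document itself, which is the saving such an index aims for when many nodes share few paths. Handling branching twig fragments or fragment nodes already bound to specific data nodes would need more than this sketch provides.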

Historical Background

XML path indexing is related to the problem of join indexing in relational database systems [15] and path indexing in object-oriented database systems (see, e.g., [1,9]). These index structures assume that the schema is homogeneous and known; these assumptions do not hold in general for XML data. The DataGuide [7] was the first path index designed specifically for XML data, where the schema may be heterogeneous and may not even be known.

Foundations

Notation

An XML document d is a rooted, ordered, node-labeled tree, where (i) ...
