HOPI: An Efficient Connection Index for Complex XML Document Collections

  • Ralf Schenkel
  • Anja Theobald
  • Gerhard Weikum
Conference paper

DOI: 10.1007/978-3-540-24741-8_15

Part of the Lecture Notes in Computer Science book series (LNCS, volume 2992)
Cite this paper as:
Schenkel R., Theobald A., Weikum G. (2004) HOPI: An Efficient Connection Index for Complex XML Document Collections. In: Bertino E. et al. (eds) Advances in Database Technology - EDBT 2004. EDBT 2004. Lecture Notes in Computer Science, vol 2992. Springer, Berlin, Heidelberg

Abstract

In this paper we present HOPI, a new connection index for XML documents based on the concept of the 2-hop cover of a directed graph introduced by Cohen et al. In contrast to most prior work on XML indexing, we consider not only paths with child or parent relationships between the nodes, but also provide space- and time-efficient reachability tests along the ancestor, descendant, and link axes to support path expressions with wildcards in our XXL search engine. We improve the theoretical concept of a 2-hop cover by developing scalable methods for index creation on very large XML data collections with long paths and extensive cross-linkage. Our experiments show that the HOPI index substantially improves query performance over previously proposed index structures while keeping space requirements low.
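
To make the 2-hop cover idea concrete, the following is a minimal sketch in Python, not the index construction method of the paper. It assumes the standard formulation in which every node u stores two label sets, L_out(u) and L_in(u), and u reaches v exactly when the two sets share a node. The builder is a naive illustration that materializes the full transitive closure first, so it does not scale the way the paper's methods are meant to; all function names and the toy graph are hypothetical.

```python
from collections import defaultdict


def transitive_closure(edges, nodes):
    """Return reach[u] = set of nodes reachable from u, including u itself."""
    succ = defaultdict(set)
    for u, v in edges:
        succ[u].add(v)
    reach = {}
    for start in nodes:
        seen = {start}
        stack = [start]
        while stack:
            node = stack.pop()
            for nxt in succ[node]:
                if nxt not in seen:
                    seen.add(nxt)
                    stack.append(nxt)
        reach[start] = seen
    return reach


def build_two_hop_labels(edges, nodes):
    """Naive 2-hop labeling (illustrative assumption, not the HOPI algorithm).

    Every node starts with itself in both labels. Candidate centers are
    visited in order of how many connections pass through them, and a
    center w is added to L_out(u) and L_in(v) only for connections
    u -> w -> v that are not covered yet.
    """
    reach = transitive_closure(edges, nodes)
    l_out = {n: {n} for n in nodes}
    l_in = {n: {n} for n in nodes}

    def coverage(w):
        ancestors = sum(1 for u in nodes if w in reach[u])
        return ancestors * len(reach[w])

    for w in sorted(nodes, key=coverage, reverse=True):
        for u in nodes:
            if w not in reach[u]:
                continue  # u cannot use w as an intermediate hop
            for v in reach[w]:
                if l_out[u].isdisjoint(l_in[v]):
                    l_out[u].add(w)
                    l_in[v].add(w)
    return l_out, l_in


def reachable(u, v, l_out, l_in):
    """Reachability test: u reaches v iff the two labels intersect."""
    return not l_out[u].isdisjoint(l_in[v])


# Toy element graph: a chain a -> b -> c -> d plus a link edge e -> c.
NODES = ["a", "b", "c", "d", "e"]
EDGES = [("a", "b"), ("b", "c"), ("c", "d"), ("e", "c")]
L_OUT, L_IN = build_two_hop_labels(EDGES, NODES)
```

A query such as `reachable("a", "d", L_OUT, L_IN)` then needs only one set intersection over two small labels instead of a graph traversal, which is the property that makes this kind of index attractive for ancestor, descendant, and link axes.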

Copyright information

© Springer-Verlag Berlin Heidelberg 2004

Authors and Affiliations

  • Ralf Schenkel (1)
  • Anja Theobald (1)
  • Gerhard Weikum (1)
  1. Max Planck Institut für Informatik, Saarbrücken, Germany
