FliX: A Flexible Framework for Indexing Complex XML Document Collections

  • Ralf Schenkel
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 3268)

Abstract

While there are many proposals for path indexes on XML documents, none of them is perfectly suited for indexing large-scale collections of interlinked XML documents. Existing strategies lack support for links, require large amounts of time to build or space to store the index, or cannot efficiently answer connection queries. This paper presents the FliX framework for connection indexing, which supports large, heterogeneous document collections with links and uses existing path indexes as building blocks. We introduce some example configurations of the framework that are appropriate for many important application scenarios. Experiments show the feasibility of our approach.

Copyright information

© Springer-Verlag Berlin Heidelberg 2004

Authors and Affiliations

  • Ralf Schenkel
    Max-Planck-Institut für Informatik, Saarbrücken, Germany
