Efficient Relational Storage and Retrieval of XML Documents

  • Albrecht Schmidt
  • Martin Kersten
  • Menzo Windhouwer
  • Florian Waas
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 1997)

Abstract

In this paper, we present a data and an execution model that allow for efficient storage and retrieval of XML documents in a relational database. The data model is strictly based on the notion of binary associations: by decomposing XML documents into small, flexible and semantically homogeneous units we are able to exploit the performance potential of vertical fragmentation. Moreover, our approach provides clear and intuitive semantics, which facilitates the definition of a declarative query algebra. Our experimental results with large collections of XML documents demonstrate the effectiveness of the techniques proposed.
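The decomposition into binary associations described above can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes that each relation is named by the path from the document root and holds two-column tuples, and the function name `decompose`, the integer object identifiers, and the sample document are all hypothetical choices made for illustration.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict
from itertools import count


def decompose(xml_text):
    """Split an XML document into binary relations keyed by root-to-node path.

    Assumption: one relation per distinct path, each holding (left, right)
    pairs. Text that follows a child element (mixed content) is ignored here.
    """
    relations = defaultdict(list)   # path -> list of (left, right) pairs
    next_oid = count(1)             # hypothetical object-identifier generator

    def visit(elem, parent_oid, parent_path):
        oid = next(next_oid)
        path = f"{parent_path}/{elem.tag}"
        # Parent-child edge; the root hangs off the pseudo-identifier 0.
        relations[path].append((parent_oid, oid))
        # Each attribute becomes its own (element, value) relation.
        for name, value in elem.attrib.items():
            relations[f"{path}/@{name}"].append((oid, value))
        # Leading character data becomes an (element, string) relation.
        text = (elem.text or "").strip()
        if text:
            relations[f"{path}/text()"].append((oid, text))
        for child in elem:
            visit(child, oid, path)

    visit(ET.fromstring(xml_text), 0, "")
    return dict(relations)


DOC = """<bib>
  <article key="a1"><title>First</title><year>1999</year></article>
  <article key="a2"><title>Second</title></article>
</bib>"""

relations = decompose(DOC)
# A query on titles touches only the title relations, not the whole document.
titles = relations["/bib/article/title/text()"]
```

Because every relation holds tuples of a single kind (for example, only article titles), a query that asks for titles reads just that relation. This is the vertical-fragmentation effect the abstract refers to, shown here on in-memory lists rather than on relational tables.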



Copyright information

© Springer-Verlag Berlin Heidelberg 2001

Authors and Affiliations

  • Albrecht Schmidt (1)
  • Martin Kersten (1)
  • Menzo Windhouwer (1)
  • Florian Waas (1)
  1. Centre for Mathematics and Computer Science (CWI), Amsterdam, The Netherlands
