ENAXS: Efficient Native XML Storage System

  • Khin-Myo Win
  • Wee-Keong Ng
  • Ee-Peng Lim
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 2642)

Abstract

XML is a self-describing meta-language that is fast emerging as a dominant standard for Web data exchange among applications. With the tremendous growth in the number of XML documents, an efficient storage system is required to manage them. Conventional databases, which require all data to adhere to an explicitly specified rigid schema, are unable to provide efficient storage for tree-structured XML documents. A new data model designed specifically for XML documents is required. In this paper, we propose a new storage system, named Efficient Native XML Storage System (ENAXS), for large and complex XML documents. ENAXS stores all XML documents in their native format to overcome the deficiencies of conventional databases, achieve optimal storage utilization, and support efficient query processing. In addition, we propose a path-based indexing scheme, embedded in ENAXS, for fast data retrieval. We have implemented ENAXS and evaluated its performance with real data sets. Experimental results show the efficiency and scalability of the proposed system in utilizing storage space and executing various types of queries.

Copyright information

© Springer-Verlag Berlin Heidelberg 2003

Authors and Affiliations

  • Khin-Myo Win (1)
  • Wee-Keong Ng (1)
  • Ee-Peng Lim (1)
  1. Centre for Advanced Information Systems, School of Computer Engineering, NTU, Singapore
