Integrating XML Sources into a Data Warehouse

  • Boris Vrdoljak
  • Marko Banek
  • Zoran Skočir
Part of the Lecture Notes in Computer Science book series (LNCS, volume 4055)


Since XML has become a standard for data exchange over the Internet, especially in B2B and B2C communication, there is an increasing need to integrate XML data into data warehousing systems. In this paper we propose a methodology for data warehouse design in which the data sources are XML Schemas and conforming XML documents. Particular emphasis is placed on conceptual and logical multidimensional design. A prototype tool has been developed to verify and support our methodology. Because of the semi-structured nature of XML data, not all the information needed for design can be safely derived from the XML Schema. In these situations, the tool generates XQuery statements to examine the XML documents. The functionality of the tool is illustrated on a real-life XML Schema that describes purchase orders.
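The idea of examining documents when the schema alone is inconclusive can be sketched roughly as follows. This is not the authors' tool and it does not generate XQuery: it is a minimal illustration in Python's standard library, assuming a hypothetical purchase-order document whose element names (`order`, `customer`, `city`) are invented for the example. It checks whether one element determines another across all records, a property that a schema typically does not state and that would have to be verified against the data.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

# Hypothetical purchase-order data; element names are illustrative only
# and are not taken from the schema used in the paper.
SAMPLE = """
<orders>
  <order id="1"><customer>C1</customer><city>Zagreb</city></order>
  <order id="2"><customer>C1</customer><city>Zagreb</city></order>
  <order id="3"><customer>C2</customer><city>Split</city></order>
  <order id="4"><customer>C2</customer><city>Rijeka</city></order>
</orders>
"""

def violations(xml_text, record_tag, key_tag, value_tag):
    """Return the keys that map to more than one distinct value.

    An empty result means the documents are consistent with a to-one
    relationship from key_tag to value_tag; it does not prove that the
    relationship holds for documents not yet seen.
    """
    root = ET.fromstring(xml_text)
    seen = defaultdict(set)
    for record in root.iter(record_tag):
        key = record.findtext(key_tag)
        value = record.findtext(value_tag)
        if key is not None and value is not None:
            seen[key].add(value)
    return {k: sorted(v) for k, v in seen.items() if len(v) > 1}

if __name__ == "__main__":
    # Customer C2 appears with two different cities, so customer -> city
    # is not a to-one relationship in this sample.
    print(violations(SAMPLE, "order", "customer", "city"))
```

A check of this kind only reports what the available documents show, so a designer would still need to confirm the result before treating the relationship as part of a dimension hierarchy.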


Dependency Graph, Data Warehouse, Fact Table, Fact Scheme, Prototype Tool
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.





Copyright information

© Springer-Verlag Berlin Heidelberg 2006

Authors and Affiliations

  • Boris Vrdoljak (1)
  • Marko Banek (1)
  • Zoran Skočir (1)
  1. Faculty of Electrical Engineering and Computing, University of Zagreb, Zagreb, Croatia
