Cluster Computing, Volume 10, Issue 1, pp 95–109

A high performance integrated web data warehousing

Abstract

Over the years, many integration techniques have been proposed for data warehouses that support web-integrated data. However, existing work focuses largely on design concepts. In this paper, we focus on the performance of a web database application, namely an integrated web data warehouse that uses a well-defined, uniform structure to handle web information sources, including semi-structured data such as XML and documents such as HTML. Using a case study, we implement a prototype that applies a web manipulation concept to both incoming sources and result outputs. The system can therefore be operated through the web and can also integrate web data sources with structured data sources. Our main contribution is a performance evaluation of an integrated web data warehouse application, which comprises two tasks. Task one verifies the correctness of the integrated data: the result set retrieved from the web integrated data warehouse system using complex and OLAP queries is checked against the result set retrieved from the existing independent data source systems. Task two measures the performance of OLAP and complex queries by investigating the source operation functions these queries use to retrieve the data. The source operation functions used by each query are obtained with the TKPROF utility.
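
The correctness check in task one can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes each result set has already been fetched as a list of row tuples with the same column order, and the function names and sample rows are hypothetical. Rows are compared as multisets, so row order is ignored but duplicate rows still count.

```python
from collections import Counter


def result_sets_match(integrated_rows, source_rows):
    """Return True when both result sets hold the same rows with the
    same multiplicities, regardless of row order."""
    return Counter(map(tuple, integrated_rows)) == Counter(map(tuple, source_rows))


def diff_result_sets(integrated_rows, source_rows):
    """Report rows that appear in only one of the two result sets."""
    integrated = Counter(map(tuple, integrated_rows))
    source = Counter(map(tuple, source_rows))
    return {
        # Rows the independent sources return but the warehouse does not.
        "missing": sorted((source - integrated).elements()),
        # Rows the warehouse returns but the independent sources do not.
        "unexpected": sorted((integrated - source).elements()),
    }


# Hypothetical sample rows: (region, year, total)
warehouse_rows = [("VIC", 2006, 10), ("NSW", 2006, 7)]
independent_rows = [("NSW", 2006, 7), ("VIC", 2006, 10)]
```

With the sample rows above, `result_sets_match(warehouse_rows, independent_rows)` holds because the two lists differ only in order; `diff_result_sets` is useful when the check fails and the differing rows need to be inspected.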

Keywords

Integrated web data warehouse performance; Performance evaluation of web complex query

Copyright information

© Springer Science+Business Media, LLC 2007

Authors and Affiliations

  1. Department of Computer Science and Engineering, La Trobe University, Bundoora, Australia
  2. Clayton School of Information Technology, Monash University, Clayton, Australia
