On-demand multidimensional data integration: toward a semantic foundation for cloud intelligence

The Journal of Supercomputing

Abstract

Cloud intelligence is a collection of technologies emerging from the migration of business intelligence and analytics technologies to a cloud computing environment combined with exploiting the massive range of new intelligence opportunities opened up by cloud computing. Cloud computing introduces several trends which require traditional business intelligence techniques to be re-thought, including agility, the ability to assemble resources, e.g., data sources, on-demand, and virtualization, e.g., that data are provided as a service over the web rather than stored in local databases. This paper focuses on the combination of data source agility and data-as-a-service virtualization and its use for cloud intelligence. After presenting the novel vision of the Cloud Warehouse, the paper goes on to present a comprehensive semantic foundation for on-demand multidimensional data integration, including formal data models, a range of query operators and re-write rules for optimization. This semantic foundation provides a sound formal basis for on-demand multidimensional data integration, which is a cornerstone of cloud intelligence.

Author information

Corresponding author

Correspondence to Torben Bach Pedersen.

About this article

Cite this article

Pedersen, T.B., Pedersen, D. & Riis, K. On-demand multidimensional data integration: toward a semantic foundation for cloud intelligence. J Supercomput 65, 217–257 (2013). https://doi.org/10.1007/s11227-011-0712-3

  • DOI: https://doi.org/10.1007/s11227-011-0712-3
