Data Warehouses: Next Challenges

  • Alejandro Vaisman
  • Esteban Zimányi
Part of the Lecture Notes in Business Information Processing book series (LNBIP, volume 96)

Summary

Data Warehouses are a fundamental component of today’s Business Intelligence infrastructure. They consolidate heterogeneous data from distributed data stores and transform it into strategic indicators for decision making. In this tutorial we give an overview of the current state of the art and point out the next challenges in the area. In particular, these include coping with more complex data, both in structure and semantics, and keeping up with the demands of new application domains such as Web, financial, manufacturing, genomic, biological, life science, multimedia, spatial, and spatiotemporal applications. We review consolidated research in spatio-temporal databases, as well as open research fields such as real-time Business Intelligence and Semantic Web data warehousing and OLAP.

Keywords

data warehouses, OLAP, spatiotemporal data warehouses, real-time data warehouses, semantic data warehouses

Copyright information

© Springer-Verlag Berlin Heidelberg 2012

Authors and Affiliations

  • Alejandro Vaisman (1)
  • Esteban Zimányi (1)
  1. Department of Computer and Decision Engineering (CoDE), Université Libre de Bruxelles, Belgium