How Can We Implement a Multidimensional Data Warehouse Using NoSQL?

  • Max Chevalier
  • Mohammed El Malki
  • Arlind Kopliku
  • Olivier Teste
  • Ronan Tournier
Conference paper
Part of the Lecture Notes in Business Information Processing book series (LNBIP, volume 241)

Abstract

Traditional OLAP (On-Line Analytical Processing) systems store data in relational databases. Unfortunately, it is difficult to manage big data volumes with such systems. As an alternative, NoSQL (Not-only SQL) systems provide scalability and flexibility for an OLAP system. We define a set of rules to map star schemas and their optimization structure, a precomputed aggregate lattice, into two logical NoSQL models: column-oriented and document-oriented. Using these rules, we analyse and implement two decision support systems, one for each model (using HBase and MongoDB, respectively). We compare both systems during the phases of data loading (with data generated using the TPC-DS benchmark), lattice generation and querying.
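
As an illustration of the two ideas named in the abstract, the following minimal Python sketch nests the dimensions and measures of a fact into a single document and precomputes an aggregate lattice, that is, one aggregate per subset of dimensions. The sample rows, the names `FACTS`, `DIMENSIONS`, `MEASURE`, `to_document` and `build_lattice`, and the document layout are assumptions made for this example; they are not the mapping rules defined in the paper, and no MongoDB or HBase API is used.

```python
from itertools import combinations
from collections import defaultdict

# Hypothetical star-schema rows: each fact already joined with its dimensions.
FACTS = [
    {"store": "S1", "item": "A", "year": 2014, "amount": 10.0},
    {"store": "S1", "item": "B", "year": 2014, "amount": 5.0},
    {"store": "S2", "item": "A", "year": 2015, "amount": 7.5},
]
DIMENSIONS = ("store", "item", "year")  # assumed dimension names
MEASURE = "amount"                      # assumed measure name


def to_document(row):
    """Nest dimension attributes and measures in one denormalized document."""
    return {
        "dimensions": {d: row[d] for d in DIMENSIONS},
        "measures": {MEASURE: row[MEASURE]},
    }


def build_lattice(rows):
    """Precompute SUM(measure) for every subset of dimensions (2^n cuboids)."""
    lattice = {}
    for k in range(len(DIMENSIONS) + 1):
        for dims in combinations(DIMENSIONS, k):
            cuboid = defaultdict(float)
            for row in rows:
                # Group by the values of the dimensions kept in this cuboid.
                key = tuple(row[d] for d in dims)
                cuboid[key] += row[MEASURE]
            lattice[dims] = dict(cuboid)
    return lattice


documents = [to_document(r) for r in FACTS]
lattice = build_lattice(FACTS)
```

With three dimensions the lattice holds eight cuboids, from the grand total (keyed by the empty tuple) up to the most detailed grouping. A query that groups by `store` alone can then read `lattice[("store",)]` instead of rescanning the facts.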

Keywords

  • NoSQL
  • OLAP
  • Aggregate lattice
  • Column-oriented
  • Document-oriented

Notes

Acknowledgements

This work is supported by the ANRT funding under CIFRE-Capgemini partnership.

Copyright information

© Springer International Publishing Switzerland 2015

Authors and Affiliations

  • Max Chevalier (1)
  • Mohammed El Malki (1, 2)
  • Arlind Kopliku (1)
  • Olivier Teste (1)
  • Ronan Tournier (1)

  1. Université de Toulouse, IRIT, UMR 5505, Toulouse, France
  2. Capgemini, Toulouse, France
