Encyclopedia of Database Systems

2018 Edition
Editors: Ling Liu, M. Tamer Özsu

Parallel and Distributed Data Warehouses

  • Ladjel Bellatreche
  • Todd Davis
  • Belayadi Djahida
Reference work entry
DOI: https://doi.org/10.1007/978-1-4614-8265-9_261

Synonyms

High performance data warehousing; Scalable decision support systems

Definition

In the era of Big Data, we face a data deluge (http://www.economist.com/node/15579717) fed by multiple data providers. Three main examples can be cited: (i) the massive use of sensors (e.g., planes generate 10 terabytes of data every 30 min), (ii) the massive use of social networks (e.g., 340 million tweets per day), and (iii) transactions (e.g., Walmart handles more than one million customer transactions every hour, which are imported into databases estimated to contain more than 2.5 petabytes of data). Decision makers need fast responses to their requests so that, by analyzing large volumes of data, they can predict user behavior in real time and offer users services. The data warehouse (\(\mathcal {DW}\)

Copyright information

© Springer Science+Business Media, LLC, part of Springer Nature 2018

Authors and Affiliations

  • Ladjel Bellatreche (1)
  • Todd Davis (2)
  • Belayadi Djahida (3)
  1. LIAS/ISAE-ENSMA, Poitiers University, Futuroscope, France
  2. Department of Computer Science and Software Engineering, Concordia University, Montreal, Canada
  3. National High School for Computer Science (ESI), Algiers, Algeria