Applied Intelligence, Volume 49, Issue 2, pp 420–434

OLAP cube partitioning based on association rules method

  • Khadija Letrache
  • Omar El Beggar
  • Mohammed Ramdani

Abstract

Partitioning is an optimization method used in business intelligence (BI) systems to improve query and processing performance. For this reason, most BI vendors integrate partitioning functionality into their solutions. However, they do not provide partitioning strategies, which remains a serious challenge for BI administrators. Some works in the literature have proposed algorithms and strategies for data warehouse partitioning. Nevertheless, most of them focus on relational data warehouse partitioning and ignore OLAP cubes, although these are the first to be affected by users' multidimensional queries. To address this, we propose in this paper a dynamic partitioning strategy for OLAP cubes based on the association rules algorithm. The first step of the proposal consists of analyzing the user queries over a specific period in order to find the frequent predicate itemsets. Afterwards, we use our proposed algorithm, based on the association rules method, to partition the data cube according to the frequent predicate itemsets obtained in the first step. Finally, we present a case study and experimental results to evaluate and validate our approach.
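The first step described above, finding frequent predicate itemsets in a query log, can be illustrated with a minimal Python sketch. This is not the algorithm proposed in the paper: it is a brute-force support count written for illustration only. The query log, the predicate strings, the function names, and the 0.5 support threshold are all hypothetical, and treating maximal frequent itemsets as partition candidates is an assumption made here to show how the second step might consume the output.

```python
from collections import Counter
from itertools import combinations


def frequent_predicate_itemsets(query_log, min_support):
    """Map each predicate itemset to its support (fraction of queries that
    contain it), keeping only itemsets with support >= min_support.

    Brute-force enumeration of all subsets: fine for a sketch, not for a
    real workload with many predicates per query.
    """
    n = len(query_log)
    counts = Counter()
    for predicates in query_log:
        items = sorted(set(predicates))
        for size in range(1, len(items) + 1):
            for combo in combinations(items, size):
                counts[frozenset(combo)] += 1
    return {s: c / n for s, c in counts.items() if c / n >= min_support}


def maximal_itemsets(frequent):
    """Keep only the frequent itemsets that have no frequent proper superset."""
    return [s for s in frequent if not any(s < other for other in frequent)]


# Hypothetical query log: each query is reduced to its set of filter predicates.
query_log = [
    {"Year=2017", "Country=Morocco"},
    {"Year=2017", "Country=Morocco"},
    {"Year=2017", "Country=Morocco", "Category=Books"},
    {"Year=2016"},
]

frequent = frequent_predicate_itemsets(query_log, min_support=0.5)
candidates = maximal_itemsets(frequent)  # assumed partition candidates
```

With this toy log, the only maximal frequent itemset is the pair of predicates on year and country, so a partition covering that slice of the cube would serve three of the four logged queries.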

Keywords

Data warehouse, Partition, OLAP cube, Association rules, Cube maintenance, Cube performance

Copyright information

© Springer Science+Business Media, LLC, part of Springer Nature 2018

Authors and Affiliations

  1. Informatics Department, LIM Laboratory, Faculty of Sciences and Techniques of Mohammedia, University Hassan II, Casablanca, Morocco
