OLAP*: Effectively and Efficiently Supporting Parallel OLAP over Big Data

  • Conference paper
Model and Data Engineering (MEDI 2013)

Part of the book series: Lecture Notes in Computer Science (LNPSE, volume 8216)

Abstract

In this paper, we investigate solutions that rely on data partitioning schemes to build OLAP data cubes in parallel, suited to novel Big Data environments. We propose the framework OLAP*, along with the associated benchmark TPC-H*d, a transformation of the well-known data warehouse benchmark TPC-H. Through performance measurements, we demonstrate the efficiency of the proposed framework, which is developed on top of the ROLAP server Mondrian.
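
As a rough illustration of the general idea of partition-based parallel cube building, the following minimal sketch splits a small fact table into fragments, aggregates each fragment independently, and merges the partial results. It is not the OLAP* implementation: the fact rows, the partitioning on the year dimension, the function names, and the use of a thread pool in place of database nodes behind Mondrian are all assumptions made for the example. Merging by addition works here because SUM is a distributive aggregate.

```python
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor

# Hypothetical fact rows: (region, year, revenue). Not taken from TPC-H*d.
FACTS = [
    ("EU", 2012, 10.0),
    ("EU", 2013, 5.0),
    ("US", 2012, 7.5),
    ("US", 2013, 2.5),
    ("EU", 2012, 4.0),
]

def partition(rows, n):
    """Split rows into n fragments by the year dimension (year modulo n)."""
    parts = [[] for _ in range(n)]
    for row in rows:
        parts[row[1] % n].append(row)
    return parts

def partial_cube(rows):
    """Aggregate one fragment over every group-by of (region, year).

    None stands for the ALL value of a dimension.
    """
    cube = defaultdict(float)
    for region, year, revenue in rows:
        for key in ((region, year), (region, None), (None, year), (None, None)):
            cube[key] += revenue
    return cube

def build_cube(rows, n=2):
    """Build partial cubes in parallel, then merge by summing matching cells."""
    merged = defaultdict(float)
    with ThreadPoolExecutor(max_workers=n) as pool:
        for cube in pool.map(partial_cube, partition(rows, n)):
            for key, value in cube.items():
                merged[key] += value
    return dict(merged)

CUBE = build_cube(FACTS)
```

In this sketch `CUBE[("EU", None)]` holds the revenue total for the EU region across all years, and the result does not depend on the number of fragments chosen.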


Copyright information

© 2013 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Cuzzocrea, A., Moussa, R., Xu, G. (2013). OLAP*: Effectively and Efficiently Supporting Parallel OLAP over Big Data. In: Cuzzocrea, A., Maabout, S. (eds) Model and Data Engineering. MEDI 2013. Lecture Notes in Computer Science, vol 8216. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-41366-7_4

  • DOI: https://doi.org/10.1007/978-3-642-41366-7_4

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-41365-0

  • Online ISBN: 978-3-642-41366-7

  • eBook Packages: Computer Science, Computer Science (R0)
