Abstract
In this paper, we investigate data partitioning schemes for building OLAP data cubes in parallel, targeting novel Big Data environments. We propose the OLAP* framework together with an associated benchmark, TPC-H*d, a transformation of the well-known data warehouse benchmark TPC-H. Performance measurements demonstrate the efficiency of the proposed framework, which is developed on top of the ROLAP server Mondrian.
Copyright information
© 2013 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Cuzzocrea, A., Moussa, R., Xu, G. (2013). OLAP*: Effectively and Efficiently Supporting Parallel OLAP over Big Data. In: Cuzzocrea, A., Maabout, S. (eds) Model and Data Engineering. MEDI 2013. Lecture Notes in Computer Science, vol 8216. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-41366-7_4
DOI: https://doi.org/10.1007/978-3-642-41366-7_4
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-41365-0
Online ISBN: 978-3-642-41366-7
eBook Packages: Computer Science (R0)