Abstract
Database clusters provide a cost-effective solution for high-performance query processing. By using either inter- or intra-query parallelism on replicated data, they can accelerate individual queries and increase throughput. However, no existing database cluster combines inter- and intra-query parallelism while also supporting intensive update transactions. C-JDBC is a successful database cluster that offers inter-query parallelism and controls database replica consistency, but it cannot accelerate the individual heavy-weight queries typical of OLAP. In this paper, we propose the Apuama Engine, which adds intra-query parallelism to C-JDBC. The result is an open-source package that supports both OLTP and OLAP applications. We validated Apuama on a 32-node cluster running OLAP queries of the TPC-H benchmark on top of PostgreSQL. Our tests show that the Apuama Engine yields super-linear speedup and scale-up in read-only environments. Furthermore, it yields excellent performance under data update operations.
Work partially funded by CNPq, Finep, Capes, Cofecub and ACI “Massive Data” in France.
© 2006 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Miranda, B., Lima, A.A.B., Valduriez, P., Mattoso, M. (2006). Apuama: Combining Intra-query and Inter-query Parallelism in a Database Cluster. In: Grust, T., et al. Current Trends in Database Technology – EDBT 2006. EDBT 2006. Lecture Notes in Computer Science, vol 4254. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11896548_49
DOI: https://doi.org/10.1007/11896548_49
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-46788-5
Online ISBN: 978-3-540-46790-8
eBook Packages: Computer Science, Computer Science (R0)