Abstract
Existing work on accelerating analytic DB query processing with (discrete) GPUs fails to fully realize their potential for speedup through parallelism: Published results do not achieve significant speedup over more performant CPU-only DBMSes when processing complete queries.
This paper presents a successful effort to better meet this challenge, in the form of a proof-of-concept query processing framework. The framework is grafted onto an existing DBMS, altering some parts of it and replacing its execution engine entirely. It intensively refactors query execution plans to make them more parallelizable, then executes them on either a CPU or a GPU. This results in a significant speedup even on a CPU, and a further speedup when using a GPU, over the chosen host DBMS (MonetDB), which itself already bests most published results utilizing a GPU for query processing.
Finally, we outline some concrete future improvements on our results which could cut processing time by half, and possibly by much more.
Work carried out by all authors as members of the Heterogeneous Computing Group at Huawei Research, Israel. Authors appear in alphabetical order.
Notes
- 1. Q4 was chosen for this example for being a query with a short plan with few operations, but involving more than one table.
- 2. LogicBlox figures normalized by 0.85 to account for HW differences.
© 2017 Springer International Publishing AG
About this paper
Cite this paper
Agbaria, A., Minor, D., Peterfreund, N., Rozenberg, E., Rosenberg, O. (2017). Overtaking CPU DBMSes with a GPU in Whole-Query Analytic Processing with Parallelism-Friendly Execution Plan Optimization. In: Blanas, S., Bordawekar, R., Lahiri, T., Levandoski, J., Pavlo, A. (eds) Data Management on New Hardware. ADMS 2016, IMDM 2016. Lecture Notes in Computer Science, vol 10195. Springer, Cham. https://doi.org/10.1007/978-3-319-56111-0_4
DOI: https://doi.org/10.1007/978-3-319-56111-0_4
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-56110-3
Online ISBN: 978-3-319-56111-0
eBook Packages: Computer Science, Computer Science (R0)