Multi-level Parallel Query Execution Framework for CPU and GPU

Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 8133)


Recent developments have shown that classic database query execution techniques, such as the iterator model, are no longer optimal for leveraging the features of modern hardware architectures. This is especially true for massively parallel architectures, such as many-core processors and GPUs. Here, processing a single tuple per step does not provide enough work to utilize the hardware resources and the cache efficiently, or to justify the overhead introduced by iterators. To overcome these problems, we use just-in-time compilation to execute whole OLAP queries on the GPU, minimizing the overhead for transfer and synchronization. We describe several patterns that can be used to build efficient execution plans and achieve the necessary parallelism. Furthermore, we show that similar processing models (and even the same source code) can be used on GPUs and modern CPU architectures, and we also point out some differences and limitations for query execution on GPUs. Results from our experimental evaluation using a TPC-H subset show that, using these patterns, we can achieve a speed-up of up to a factor of 5 compared to a CPU implementation.
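The contrast drawn in the abstract can be illustrated with a small sketch. It is not taken from the paper: the column data, the query (a filtered revenue sum), and all class and function names are hypothetical, and a thread pool stands in for parallel hardware. The first variant pulls one tuple at a time through iterator operators; the second fuses scan, filter, and aggregation into one loop per chunk and merges the per-chunk partial results, which is one common way to expose parallelism at two levels.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical toy columns; not data from the paper.
PRICES = [10.0, 25.0, 40.0, 5.0, 60.0, 30.0]
DISCOUNTS = [0.1, 0.0, 0.2, 0.5, 0.0, 0.1]


class Scan:
    """Iterator-model leaf operator: hands out one tuple per next() call."""

    def __init__(self, prices, discounts):
        self._rows = iter(zip(prices, discounts))

    def next(self):
        return next(self._rows, None)


class Filter:
    """Iterator-model selection: pulls tuples from its child one by one."""

    def __init__(self, child, min_price):
        self._child = child
        self._min_price = min_price

    def next(self):
        while (row := self._child.next()) is not None:
            if row[0] >= self._min_price:
                return row
        return None


def iterator_revenue(prices, discounts, min_price):
    """Tuple-at-a-time execution: one call chain per tuple."""
    op = Filter(Scan(prices, discounts), min_price)
    total = 0.0
    while (row := op.next()) is not None:
        total += row[0] * (1.0 - row[1])
    return total


def fused_chunk(prices, discounts, min_price, start, stop):
    """Scan, filter, and aggregate fused into one tight loop over a chunk."""
    partial = 0.0
    for i in range(start, stop):
        if prices[i] >= min_price:
            partial += prices[i] * (1.0 - discounts[i])
    return partial


def fused_revenue(prices, discounts, min_price, chunk_size=2):
    """Run the fused loop on independent chunks, then merge the partial sums."""
    bounds = [
        (start, min(start + chunk_size, len(prices)))
        for start in range(0, len(prices), chunk_size)
    ]
    with ThreadPoolExecutor() as pool:
        partials = pool.map(
            lambda b: fused_chunk(prices, discounts, min_price, b[0], b[1]),
            bounds,
        )
        return sum(partials)
```

Both functions compute the same aggregate. The fused variant does more work per call and needs only one merge step at the end, which is the property the abstract identifies as necessary to keep parallel hardware busy. In a compiled setting the fused loop would be generated per query rather than written by hand.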


Keywords: Query Processing, Hash Table, Query Execution, Iterator Model, OLAP Query
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.




Copyright information

© Springer-Verlag Berlin Heidelberg 2013

Authors and Affiliations

  1. Ilmenau University of Technology, Germany
  2. Karlsruhe Institute of Technology, Germany
  3. SAP AG, Germany
