Database Scan Variants on Modern CPUs: A Performance Study

  • David Broneske
  • Sebastian Breß
  • Gunter Saake
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 8921)


Main-memory databases rely on highly tuned database operations to achieve peak performance. Recently, it has been shown that different code optimizations for database operations favor different processors. However, it is still not clear how combining code optimizations (e.g., loop unrolling and vectorization) affects the performance of database algorithms on different processors.

In this paper, we extend prior studies with an in-depth performance analysis of different variants of the scan operator. We find that the performance of the scan operator on different processors becomes even harder to predict when multiple code optimizations are combined. Since the scan is the simplest database operator, we expect the same effects for more complex operators such as joins. Based on these results, we identify practical problems for a query processor and discuss how we can counter these challenges in future work.
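To make the notion of "scan variants" concrete, the following is a minimal sketch of how one selection scan might be written in several ways. It is not the authors' implementation: the function names, the `PositionList` alias, the less-than predicate, and the unroll factor of 4 are all assumptions chosen for illustration. A vectorized (SIMD) variant is omitted because it would depend on a specific instruction set.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical alias: a position list holds the indices of all matching tuples.
using PositionList = std::vector<std::size_t>;

// Variant 1 (assumed baseline): one conditional branch per tuple.
PositionList scan_branching(const std::vector<std::int32_t>& column,
                            std::int32_t bound) {
    PositionList result;
    for (std::size_t i = 0; i < column.size(); ++i) {
        if (column[i] < bound) result.push_back(i);
    }
    return result;
}

// Variant 2: branch-free. Every index is written unconditionally, but the
// write cursor only advances when the predicate holds.
PositionList scan_branch_free(const std::vector<std::int32_t>& column,
                              std::int32_t bound) {
    PositionList result(column.size());
    std::size_t count = 0;
    for (std::size_t i = 0; i < column.size(); ++i) {
        result[count] = i;
        count += static_cast<std::size_t>(column[i] < bound);
    }
    result.resize(count);
    return result;
}

// Variant 3: branch-free combined with manual loop unrolling (factor 4).
PositionList scan_branch_free_unrolled(const std::vector<std::int32_t>& column,
                                       std::int32_t bound) {
    const std::size_t n = column.size();
    PositionList result(n);
    std::size_t count = 0;
    std::size_t i = 0;
    // Main loop: four tuples per iteration.
    for (; i + 4 <= n; i += 4) {
        result[count] = i;
        count += static_cast<std::size_t>(column[i] < bound);
        result[count] = i + 1;
        count += static_cast<std::size_t>(column[i + 1] < bound);
        result[count] = i + 2;
        count += static_cast<std::size_t>(column[i + 2] < bound);
        result[count] = i + 3;
        count += static_cast<std::size_t>(column[i + 3] < bound);
    }
    // Remainder loop: the tuples left over when n is not a multiple of 4.
    for (; i < n; ++i) {
        result[count] = i;
        count += static_cast<std::size_t>(column[i] < bound);
    }
    result.resize(count);
    return result;
}
```

All three functions return the same position list for the same input. They differ only in how the work is expressed to the compiler and processor, which is the kind of variation whose performance impact the paper studies.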


Parallel Algorithm, Selectivity Factor, Code Optimization, Cache Line, Position List
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.



We thank Jens Teubner from TU Dortmund and Max Heimel from TU Berlin for helpful feedback and discussions.



Copyright information

© Springer International Publishing Switzerland 2015

Authors and Affiliations

  • David Broneske (1, email author)
  • Sebastian Breß (1, 2)
  • Gunter Saake (1)
  1. University of Magdeburg, Magdeburg, Germany
  2. TU Dortmund University, Dortmund, Germany
