The VLDB Journal, Volume 25, Issue 1, pp 27–52

An experimental evaluation and analysis of database cracking

  • Felix Martin Schuhknecht
  • Alekh Jindal
  • Jens Dittrich
Special Issue Paper

Abstract

Database cracking has been an area of active research in recent years. The core idea of database cracking is to create indexes adaptively and incrementally as a side product of query processing. Several works have proposed different cracking techniques for different aspects including updates, tuple reconstruction, convergence, concurrency control, and robustness. Our 2014 VLDB paper “The Uncracked Pieces in Database Cracking” (PVLDB 7:97–108, 2013/VLDB 2014) was the first comparative study of these different methods by an independent group. In this article, we extend our published experimental study on database cracking and bring it up to date. Our goal is to critically review several aspects, identify the potential, and propose promising directions in database cracking. With this study, we hope to expand the scope of database cracking and possibly leverage cracking in database engines other than MonetDB. We repeat several prior database cracking works, including the core cracking algorithms as well as three other works on convergence (hybrid cracking), tuple reconstruction (sideways cracking), and robustness (stochastic cracking), respectively. Beyond the material in our conference paper, we now also look at a recently published study on CPU efficiency (predication cracking). We evaluate these works and show possible directions to do even better. As a further extension, we evaluate the whole class of parallel cracking algorithms that were proposed in three recent works. Altogether, in this work we revisit 8 papers on database cracking and evaluate in total 18 cracking methods, 6 sorting algorithms, and 3 full index structures. Additionally, we test cracking under a variety of experimental settings, including high selectivity queries, low selectivity queries, varying selectivity, and multiple query access patterns. Here, low selectivity means that many entries qualify, whereas high selectivity means that only a few entries qualify. Finally, we compare cracking against different sorting algorithms as well as against different main memory optimized indexes, including the recently proposed adaptive radix tree (ART). Our results show that: (1) the previously proposed cracking algorithms are repeatable, (2) there is still enough room to significantly improve the previously proposed cracking algorithms, (3) parallelizing cracking algorithms efficiently is a hard task, (4) cracking depends heavily on query selectivity, (5) cracking needs to catch up with modern indexing trends, and (6) different indexing algorithms have different indexing signatures.
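The core idea stated in the abstract, building an index incrementally as a side product of query processing, can be illustrated with a minimal sketch. It is not taken from the article: the class name `CrackerColumn`, the method `range_query`, and the sample data are hypothetical, and the sketch assumes the simplest variant in which each query bound partitions exactly one piece in two (quicksort-style), with the resulting boundaries remembered in a sorted list.

```python
import bisect


class CrackerColumn:
    """Hypothetical minimal cracker index over a copy of one integer column."""

    def __init__(self, values):
        self.data = list(values)  # cracker column, reorganized in place by queries
        self.pivots = []          # sorted crack values introduced so far
        self.positions = []       # positions[i]: first index with value >= pivots[i]

    def _crack(self, pivot):
        """Ensure a piece boundary exists at `pivot` and return its position."""
        i = bisect.bisect_left(self.pivots, pivot)
        if i < len(self.pivots) and self.pivots[i] == pivot:
            return self.positions[i]  # boundary already known, no data movement

        # The only piece that can contain `pivot` lies between the
        # neighboring boundaries (or the column ends).
        lo = self.positions[i - 1] if i > 0 else 0
        hi = self.positions[i] if i < len(self.pivots) else len(self.data)

        # Partition that piece in two: values < pivot left, values >= pivot right.
        left, right = lo, hi - 1
        while left <= right:
            if self.data[left] < pivot:
                left += 1
            else:
                self.data[left], self.data[right] = self.data[right], self.data[left]
                right -= 1

        self.pivots.insert(i, pivot)
        self.positions.insert(i, left)
        return left

    def range_query(self, low, high):
        """Return all values v with low <= v < high, cracking on both bounds."""
        start = self._crack(low)
        end = self._crack(high)
        return self.data[start:end]


col = CrackerColumn([13, 16, 4, 9, 2, 12, 7, 1, 19, 3, 14, 11, 8, 6])
first = col.range_query(7, 14)   # partitions the whole column twice
second = col.range_query(3, 10)  # only touches the pieces around 3 and 10
```

Each query pays only for partitioning the pieces its bounds fall into, so the column gradually approaches a sorted order in the regions that are actually queried. Under this simplification, a repeated bound costs a lookup in `pivots` and moves no data.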

Keywords

Adaptive indexing · Database cracking · Sorting · Multi-threaded algorithms

References

  1. Adelson-Velsky, G., et al.: An algorithm for the organization of information. In: USSR Academy of Sciences, pp. 263–266 (1962)
  2. Alvarez, V., Schuhknecht, F.M., Dittrich, J., Richter, S.: Main memory adaptive indexing for multi-core systems. In: DaMoN, Snowbird, UT, USA, pp. 3:1–3:10 (2014)
  3. Bayer, R., McCreight, E.M.: Organization and maintenance of large ordered indices. Acta Inf. 1, 173–189 (1972)
  4. Birkeland, O.R.: Searching large data volumes with MISD processing. Ph.D. Thesis (2008)
  5. DeWitt, D.J., Naughton, J.F., et al.: Practical skew handling in parallel joins. In: VLDB, Proceedings, pp. 27–40 (1992)
  6. Finch, T.: Incremental Calculation of Weighted Mean and Variance. University of Cambridge Computing Service, Cambridge (2009)
  7. Generalized Heap Impl. https://github.com/valyala/gheap
  8. Graefe, G., Halim, F., Idreos, S., et al.: Concurrency control for adaptive indexing. PVLDB 5, 656–667 (2012)
  9. Graefe, G., Halim, F., Idreos, S., et al.: Transactional support for adaptive indexing. VLDB J. 23(2), 303–328 (2014)
  10. Graefe, G., Kuno, H.: Self-selecting, self-tuning, incrementally optimized indexes. In: EDBT, pp. 371–381 (2010)
  11. Halim, F., Idreos, S., et al.: Stochastic database cracking: towards robust adaptive indexing in main-memory column-stores. PVLDB 5, 502–513 (2012)
  12. Hildebrandt, P., Isbitz, H.: Radix exchange: an internal sorting method for digital computers. J. ACM 6(2), 156–163 (1959)
  13. Hoare, C.A.R.: Quicksort. Commun. ACM 4(7), 321 (1961)
  14. Idreos, S., Kersten, M., Manegold, S.: Updating a cracked database. In: SIGMOD, pp. 413–424 (2007)
  15. Idreos, S., Kersten, M., Manegold, S.: Self-organizing tuple reconstruction in column-stores. In: SIGMOD, pp. 297–308 (2009)
  16. Idreos, S., Manegold, S., et al.: Merging what’s cracked, cracking what’s merged. PVLDB 4, 586–597 (2011)
  17. Idreos, S., et al.: Database cracking. In: CIDR, pp. 68–78 (2007)
  18. Kersten, M., et al.: Cracking the database store. In: CIDR, pp. 213–224 (2005)
  19. Kim, C., et al.: FAST: Fast architecture sensitive tree search on modern CPUs and GPUs. In: SIGMOD, pp. 339–350 (2010)
  20. Leis, V., et al.: The adaptive radix tree: ARTful indexing for main-memory databases. In: ICDE, pp. 38–49 (2013)
  21. Martinez-Palau, X., Dominguez-Sal, D., et al.: Two-way replacement selection. PVLDB 3, 871–881 (2010)
  22. McCalpin, J.D.: STREAM benchmark, version from January 17. https://www.cs.virginia.edu/stream/FTP/Code/stream.c (2013)
  23. Pirk, H., Petraki, E., Idreos, S., Manegold, S., Kersten, M.L.: Database cracking: fancy scan, not poor man’s sort! In: DaMoN, Snowbird, UT, USA, pp. 4:1–4:8 (2014)
  24. Rao, J., Ross, K.A.: Making B+-trees cache conscious in main memory. In: SIGMOD, pp. 475–486 (2000)
  25. Schuhknecht, F.M., Jindal, A., Dittrich, J.: The uncracked pieces in database cracking. PVLDB 7, 97–108 (2013)
  26. Schuhknecht, F.M., Khanchandani, P., Dittrich, J.: On the surprising difficulty of simple things: the case of radix partitioning. PVLDB 8, 934–937 (2015)

Copyright information

© Springer-Verlag Berlin Heidelberg 2015

Authors and Affiliations

  1. Information Systems Group, Saarland University, Saarbrücken, Germany
  2. CSAIL, MIT, Cambridge, USA
