Abstract
Cost models are an essential part of database systems, as they are the basis of query performance optimization. Disk-based systems are well understood, and sophisticated models exist to compare data structures and to estimate query costs based on disk I/O operations. Cost models for in-memory databases shift the focus from disk I/O to main memory accesses and CPU costs. However, modeling memory accesses is fundamentally different from modeling disk I/O, so the common disk-based models no longer apply.
In this work, we examine four plan operations in in-memory column stores (scan with equality selection, scan with range selection, positional lookup, and insert) with respect to different physical column organizations. We consider uncompressed columns as well as bit-compressed and dictionary-encoded columns with sorted and unsorted dictionaries. Furthermore, we discuss tree indices on columns and dictionaries and present a detailed parameter evaluation that considers the number of distinct values, value skewness, and value disorder. Finally, we present and evaluate a cost model based on cache misses for estimating the runtime of the discussed plan operations.
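To make the four plan operations concrete, the following is a minimal Python sketch of a dictionary-encoded column with a sorted dictionary. It is an illustration under stated assumptions, not the paper's implementation: the class name `SortedDictColumn` and its method names are hypothetical, codes are stored as plain Python integers rather than bit-compressed, and no tree index or cache-miss accounting is modeled.

```python
from bisect import bisect_left, bisect_right


class SortedDictColumn:
    """Illustrative dictionary-encoded column with a sorted dictionary.

    Hypothetical sketch: codes are plain ints (no bit compression),
    and there is no tree index on the column or the dictionary.
    """

    def __init__(self, values):
        # Sorted dictionary of distinct values; a code is the value's index.
        self.dictionary = sorted(set(values))
        code_of = {v: i for i, v in enumerate(self.dictionary)}
        self.codes = [code_of[v] for v in values]

    def lookup(self, position):
        # Positional lookup: read the code, then dereference the dictionary.
        return self.dictionary[self.codes[position]]

    def scan_equal(self, value):
        # Binary search in the dictionary, then a sequential scan over codes.
        i = bisect_left(self.dictionary, value)
        if i == len(self.dictionary) or self.dictionary[i] != value:
            return []
        return [pos for pos, c in enumerate(self.codes) if c == i]

    def scan_range(self, low, high):
        # A sorted dictionary maps the value range [low, high] to a code range.
        lo = bisect_left(self.dictionary, low)
        hi = bisect_right(self.dictionary, high)
        return [pos for pos, c in enumerate(self.codes) if lo <= c < hi]

    def insert(self, value):
        # A new distinct value shifts all codes at or above its dictionary slot.
        i = bisect_left(self.dictionary, value)
        if i == len(self.dictionary) or self.dictionary[i] != value:
            self.dictionary.insert(i, value)
            self.codes = [c + 1 if c >= i else c for c in self.codes]
        self.codes.append(i)


col = SortedDictColumn(["delta", "alpha", "charlie", "alpha"])
col.insert("bravo")
```

In this sketch the sorted dictionary lets a range selection compare integer codes only, while inserting a new distinct value forces a rewrite of the code vector. An unsorted dictionary would reverse that trade-off: a new value is appended without touching existing codes, but a range selection can no longer be expressed as a single code interval.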
Copyright information
© 2013 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Schwalb, D., Faust, M., Krueger, J., Plattner, H. (2013). Physical Column Organization in In-Memory Column Stores. In: Meng, W., Feng, L., Bressan, S., Winiwarter, W., Song, W. (eds) Database Systems for Advanced Applications. DASFAA 2013. Lecture Notes in Computer Science, vol 7826. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-37450-0_4
DOI: https://doi.org/10.1007/978-3-642-37450-0_4
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-37449-4
Online ISBN: 978-3-642-37450-0
eBook Packages: Computer Science, Computer Science (R0)