Implementing Window Functions in a Column-Store with Late Materialization
A window function is a generalization of the aggregation operation. Unlike aggregation, its output always has the same cardinality as its input. That is, the operator computes values of additional attributes for each row depending on the row's context, which is expressed either by a sliding window or by a previously evaluated row. Window functions are a powerful tool that is popular among data analysts and supported by the majority of industrial DBMSes. They make it possible to express quite complex use cases concisely, such as running sums and averages, local maxima and minima, and different types of ranking. Since such queries can be expressed without self-joins and correlated subqueries, they can be evaluated much more efficiently.
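To make the semantics concrete, here is a minimal in-memory sketch of two of the use cases named above, a running sum and a moving average over a sliding window of rows. The function names and the `preceding` parameter are illustrative assumptions, not part of the paper's system; the point is only that each input row yields exactly one output value.

```python
from typing import List, Sequence


def running_sum(values: Sequence[float]) -> List[float]:
    """Roughly SUM(v) OVER (ORDER BY ... ROWS UNBOUNDED PRECEDING)."""
    out: List[float] = []
    total = 0.0
    for v in values:
        total += v          # context = all previously evaluated rows
        out.append(total)   # one output value per input row
    return out


def moving_average(values: Sequence[float], preceding: int) -> List[float]:
    """Roughly AVG(v) OVER (ROWS BETWEEN preceding PRECEDING AND CURRENT ROW)."""
    out: List[float] = []
    window_sum = 0.0
    for i, v in enumerate(values):
        window_sum += v
        if i > preceding:
            # Drop the row that just slid out of the window.
            window_sum -= values[i - preceding - 1]
        rows_in_window = min(i, preceding) + 1
        out.append(window_sum / rows_in_window)
    return out
```

Both functions return a list of the same length as their input, in contrast to a plain aggregate, which would collapse the rows into a single value.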
In this paper we discuss an implementation of window functions inside a disk-based column-store with late materialization. Late materialization is a technique that defers the reconstruction of tuples from individual columns for as long as possible. Popular in the late 2000s, it is rarely considered nowadays. However, for window functions it allows the memory footprint to be lowered substantially. Another contribution of this paper is the application of a segment tree to computing RANGE-based window functions.
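The following is a rough, purely in-memory sketch of how a segment tree can serve RANGE-based frames; it is not the implementation described in the paper, and it ignores disk access and late materialization entirely. It assumes that rows are already sorted by a numeric ordering key, that frame offsets are non-negative, and that the aggregate is an associative binary operation. The names `SegmentTree` and `range_window` are hypothetical.

```python
from bisect import bisect_left, bisect_right
from typing import Any, Callable, List, Sequence


class SegmentTree:
    """Iterative segment tree over a fixed sequence and an associative op."""

    def __init__(self, values: Sequence[Any], op: Callable[[Any, Any], Any]):
        self.n = len(values)
        self.op = op
        self.tree: List[Any] = [None] * (2 * self.n)
        self.tree[self.n:] = list(values)          # leaves
        for i in range(self.n - 1, 0, -1):         # internal nodes
            self.tree[i] = op(self.tree[2 * i], self.tree[2 * i + 1])

    def query(self, left: int, right: int) -> Any:
        """Combine values[left:right] (half-open, non-empty) in O(log n)."""
        res_left, res_right = None, None
        lo, hi = left + self.n, right + self.n
        while lo < hi:
            if lo & 1:
                node = self.tree[lo]
                res_left = node if res_left is None else self.op(res_left, node)
                lo += 1
            if hi & 1:
                hi -= 1
                node = self.tree[hi]
                res_right = node if res_right is None else self.op(node, res_right)
            lo >>= 1
            hi >>= 1
        if res_left is None:
            return res_right
        if res_right is None:
            return res_left
        return self.op(res_left, res_right)


def range_window(keys: Sequence[float], values: Sequence[Any],
                 preceding: float, following: float,
                 op: Callable[[Any, Any], Any]) -> List[Any]:
    """For each row, aggregate rows whose key lies in
    [key - preceding, key + following]. Assumes keys are sorted ascending."""
    tree = SegmentTree(values, op)
    out = []
    for k in keys:
        start = bisect_left(keys, k - preceding)   # first row inside the frame
        end = bisect_right(keys, k + following)    # one past the last row
        out.append(tree.query(start, end))
    return out
```

For example, `range_window([1, 2, 4, 7, 8], [5, 3, 9, 1, 6], 2, 0, max)` computes a maximum over all rows whose key is within 2 below the current key. Unlike ROWS frames, the number of rows in a RANGE frame varies from row to row, which is why the frame bounds are located by binary search on the key before the tree is queried.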
Keywords: Window function, Analytical function, Aggregation, Column-store, Query processing, Late materialization, OLAP, PosDB