Implementing Window Functions in a Column-Store with Late Materialization

Part of the Lecture Notes in Computer Science book series (LNPSE, volume 11815)

Abstract

A window function is a generalization of the aggregation operation. Unlike aggregation, the cardinality of its output is always the same as the cardinality of its input. That is, the operator computes values of extra attributes for each row depending on the row's context, which is expressed either by a sliding window or by a previously evaluated row. Window functions are a powerful tool that is popular among data analysts and supported by the majority of industrial DBMSes. They make it possible to express quite complex use cases gracefully, such as running sums and averages, local maxima and minima, and different types of ranking. Since these use cases can be expressed without self-joins and correlated subqueries, they can be evaluated much more efficiently.
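To make the cardinality property concrete, the following minimal Python sketch emulates a running sum, roughly what `SUM(v) OVER (PARTITION BY p ORDER BY o ROWS UNBOUNDED PRECEDING)` would compute. The function name `running_sum`, the dictionary-based row format, and the output attribute `running_sum` are illustrative assumptions and are not taken from the paper.

```python
from itertools import groupby
from operator import itemgetter


def running_sum(rows, partition_key, order_key, value_key):
    """Return one output row per input row, extended with a running sum.

    Hypothetical helper: rows are plain dicts, and the result is ordered
    by (partition_key, order_key).
    """
    # Sort so that each partition is contiguous and internally ordered.
    ordered = sorted(rows, key=lambda r: (r[partition_key], r[order_key]))
    out = []
    for _, partition in groupby(ordered, key=itemgetter(partition_key)):
        total = 0  # the running sum restarts in every partition
        for row in partition:
            total += row[value_key]
            # The input row is kept and gains one extra attribute.
            out.append({**row, "running_sum": total})
    return out


rows = [
    {"dept": "a", "day": 1, "amt": 10},
    {"dept": "b", "day": 1, "amt": 5},
    {"dept": "a", "day": 2, "amt": 7},
]
result = running_sum(rows, "dept", "day", "amt")
```

Here `result` has three rows, the same number as `rows`, whereas a plain `GROUP BY dept` aggregation would return only two.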

In this paper we discuss an implementation of window functions inside a disk-based column-store with late materialization. Late materialization is a technique that aims to defer tuple reconstruction from individual columns for as long as possible. Although popular in the late 2000s, it is rarely considered nowadays. However, in the case of window functions it allows the memory footprint to be lowered substantially. Another contribution of this paper is the application of a segment tree to computing RANGE-based window functions.
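As a rough illustration of how a segment tree can serve RANGE-based frames, the sketch below builds an array-based tree over a column that is already sorted by the ordering key, locates each row's frame by binary search, and answers the frame aggregate in logarithmic time. It is a generic textbook-style construction under stated assumptions, not the authors' PosDB implementation; the names `SegmentTree` and `range_window`, the in-memory lists, and the frame definition `[key - preceding, key + following]` are all hypothetical.

```python
from bisect import bisect_left, bisect_right


class SegmentTree:
    """Array-based segment tree: leaves at tree[n:], inner nodes at tree[1:n]."""

    def __init__(self, values, combine, identity):
        self.n = len(values)
        self.combine = combine
        self.identity = identity
        # No child pointers are stored: node i has children 2*i and 2*i + 1.
        self.tree = [identity] * self.n + list(values)
        for i in range(self.n - 1, 0, -1):
            self.tree[i] = combine(self.tree[2 * i], self.tree[2 * i + 1])

    def query(self, lo, hi):
        """Aggregate of values[lo:hi] (half-open interval)."""
        left, right = self.identity, self.identity
        lo += self.n
        hi += self.n
        while lo < hi:
            if lo & 1:  # lo is a right child: take it and step right
                left = self.combine(left, self.tree[lo])
                lo += 1
            if hi & 1:  # hi - 1 is a left child: take it
                hi -= 1
                right = self.combine(self.tree[hi], right)
            lo //= 2
            hi //= 2
        return self.combine(left, right)


def range_window(keys, values, preceding, following, combine, identity):
    """For each row, aggregate values of rows whose key lies in
    [key - preceding, key + following]. `keys` must be sorted ascending."""
    tree = SegmentTree(values, combine, identity)
    result = []
    for k in keys:
        lo = bisect_left(keys, k - preceding)   # first row inside the frame
        hi = bisect_right(keys, k + following)  # one past the last row
        result.append(tree.query(lo, hi))
    return result


keys = [1, 2, 4, 7, 8]
values = [10, 20, 30, 40, 50]
sums = range_window(keys, values, 2, 0, lambda a, b: a + b, 0)
maxima = range_window(keys, values, 3, 3, max, float("-inf"))
```

The same tree works for any associative `combine` with an identity element, which is why sum and maximum share one code path here. This sketch does not pad the input to a perfect binary tree and keeps everything in memory, so it says nothing about the disk-based or late-materialization aspects discussed in the paper.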

Keywords

  • Window function
  • Analytical function
  • Aggregation
  • Column-store
  • Query processing
  • Late materialization
  • OLAP
  • PosDB

Notes

  1. Note that it is assumed here that the frame offset does not depend on the current row value. Otherwise, the cumulative approach is still attractive, though less dramatically so.

  2. In our implementation, several window functions can be processed at once if they are defined over the same window.

  3. In an array-based implementation, we store just the data, without auxiliary information such as the pointers to children that are necessary to describe an arbitrary binary tree.

  4. https://xlinux.nist.gov/dads//HTML/perfectBinaryTree.html.

Author information

Corresponding author

Correspondence to George Chernishev.

Copyright information

© 2019 Springer Nature Switzerland AG

About this paper

Cite this paper

Mukhaleva, N., Grigorev, V., Chernishev, G. (2019). Implementing Window Functions in a Column-Store with Late Materialization. In: Schewe, KD., Singh, N. (eds) Model and Data Engineering. MEDI 2019. Lecture Notes in Computer Science, vol 11815. Springer, Cham. https://doi.org/10.1007/978-3-030-32065-2_21

  • DOI: https://doi.org/10.1007/978-3-030-32065-2_21

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-32064-5

  • Online ISBN: 978-3-030-32065-2

  • eBook Packages: Computer Science, Computer Science (R0)