Optimizing Window Aggregate Functions in Relational Database Systems

Song, Guangxuan; Ma, Jiansong; Wang, Xiaoling; Jin, Cheqing; Cao, Yu

doi:10.1007/978-3-319-55753-3_22

Guangxuan Song¹⁸,
Jiansong Ma¹⁸,
Xiaoling Wang¹⁸,
Cheqing Jin¹⁸ &
…
Yu Cao¹⁹

Part of the book series: Lecture Notes in Computer Science ((LNISA,volume 10177))

Included in the following conference series:

International Conference on Database Systems for Advanced Applications

2809 Accesses
1 Citations

Abstract

The window function has become an important OLAP extension of SQL since SQL:2003, and is supported by major commercial RDBMSs (e.g. Oracle, DB2, SQL Server, Teradata and Pivotal Greenplum) and by emerging Big Data platforms (e.g. Google Tenzing, Apache Hive, Pivotal HAWQ and Cloudera Impala). Window functions are designed for advanced data analytics use cases, bringing significant functional and performance enhancements to OLAP and decision support applications. However, we identify that existing window function evaluation approaches are still with significant room for improvement. In this paper, we revisit the conventional two-phase evaluation framework for window functions in relational databases, and propose several novel optimization techniques which aim to minimize the redundant data accesses and computations during the function calls invoked over window frames. We have integrated the proposed techniques into PostgreSQL, and compared them with both PostgreSQL’s and SQL Server’s native window function implementation over the TPC benchmark. Our comprehensive experimental studies demonstrate significant speedup over existing approaches.

This is a preview of subscription content, log in via an institution to check access.

Access this chapter

Log in via an institution

Chapter: USD 29.95; Price excludes VAT (USA)

eBook: USD 39.99; Price excludes VAT (USA)

Softcover Book: USD 54.99; Price excludes VAT (USA)

Tax calculation will be finalised at checkout

Purchases are for personal use only

Institutional subscriptions

References

Zuzarte, C., Pirahesh, H., Ma, W., Cheng, Q., Liu, L., Wong, K.: Winmagic: subquery elimination using window aggregation. In: Proceedings of the ACM SIGMOD International Conference on Management of Data, SIGMOD 2003, pp. 652–656. ACM, New York (2003)
Google Scholar
Bellamkonda, S., Ahmed, R., Witkowski, A., Amor, A., Zait, M., Lin, C.-C.: Enhanced subquery optimizations in oracle. Proc. VLDB Endow. 2(2), 1366–1377 (2009)
Google Scholar
Ben-Gan, I.: Microsoft SQL Server High-Performance T-SQL Using Window Functions. Microsoft Press (2012)
Google Scholar
Cao, Y., Bramandia, R., Chan, C.Y., Tan, K.-L.: Optimized query evaluation using cooperative sorts. In: Proceedings of the 26th International Conference on Data Engineering, ICDE, Long Beach, California, USA, pp. 601–612. IEEE (2010)
Google Scholar
Cao, Y., Chan, C.-Y., Li, J., Tan, K.-L.: Optimization of analytic window functions. Proc. VLDB Endow. 5(11), 1244–1255 (2012)
Article Google Scholar
Cao, Y., Bramandia, R., Chan, C.-Y., Tan, K.-L.: Sort-sharing-aware query processing. VLDB J. 21(3), 411–436 (2012)
Article Google Scholar
Window functions for postgresql desgin overview (2008). http://www.umitanuki.net/pgsql/wfv08/design.html
Bellamkonda, S., Bozkaya, T., Ghosh, B., Gupta, A., Haydu, J., Subramanian, S., Witkowski, A.: Analytic functions in oracle 8i. Technical report (2000)
Google Scholar
Jin, C.Q., Yi, K., Chen, L., Yu, J.X., Lin, X.M.: Sliding-window top-k queries on uncertain streams. VLDB J. 19(3), 411–435 (2010)
Article Google Scholar
Li, J., Maier, D., Tufte, K., Papadimos, V., Tucker, P.A.: Semantics and evaluation techniques for window aggregates in data streams. In: Proceedings of the ACM SIGMOD International Conference on Management of Data, SIGMOD 2005, pp. 311–322. ACM, New York (2005)
Google Scholar
Li, J., Maier, D., Tufte, K., Papadimos, V., Tucker, P.A.: No pain, no gain:efficient evaluation og sliding-window aggregates over data streams. SIGMOD Rec. 34(1), 39–44 (2005)
Article Google Scholar
Neumann, T., Moerkotte, G.: A combined framework for grouping, order optimization. In: Proceedings of the 30st International Conference on Very Large Data Bases, VLDB 2004, pp. 960–971. VLDB Endowment (2004)
Google Scholar
Simmen, D., Shekita, E., Malkemus, T.: Fundamental techniques for order optimization. In: Proceedings of the 1996 ACM SIGMOD International Conference on Management of Data, SIGMOD 1996, pp. 57–67. ACM, NY (1996)
Google Scholar
Wang, X.Y., Chernicak, M.: Avoiding sorting, grouping in processing queries. In: Proceedings of the 29th International Conference on Very Large Data Bases, VLDB 2003, pp. 826–837. VLDB Endowment (2003)
Google Scholar
Leis, V., Kan, K., Kemper, A., et al.: Efficient processing of window functions in analytical SQL queries. Proc. VLDB Endow. 8(10), 1058–1069 (2015)
Article Google Scholar
Wesley, R., Xu, F.: Incremental computation of common windowed holistic aggregates. Proc. VLDB Endow. 9(12), 1221–1232 (2016)
Article Google Scholar

Download references

Acknowledgement

This work was supported by NSFC grants (No. 61472141,U1501252 and 61021004), Shanghai Knowledge Service Platform Project (No. ZF1213), Shanghai Leading Academic Discipline Project (Project NumberB412) and Shanghai Agriculture Science Program(2015) Number 3-2.

Author information

Authors and Affiliations

Shanghai Key Laboratory of Trustworthy Computing, MOE International Joint Lab of Trustworthy Software, East China Normal University, Shanghai, China
Guangxuan Song, Jiansong Ma, Xiaoling Wang & Cheqing Jin
EMC Labs, Beijing, China
Yu Cao

Authors

Guangxuan Song
View author publications
You can also search for this author in PubMed Google Scholar
Jiansong Ma
View author publications
You can also search for this author in PubMed Google Scholar
Xiaoling Wang
View author publications
You can also search for this author in PubMed Google Scholar
Cheqing Jin
View author publications
You can also search for this author in PubMed Google Scholar
Yu Cao
View author publications
You can also search for this author in PubMed Google Scholar

Corresponding author

Correspondence to Xiaoling Wang .

Editor information

Editors and Affiliations

Arizona State University , Tempe - Phoenix, Arizona, USA
Selçuk Candan
Hong Kong University of Science and Tech , Hong Kong, China
Lei Chen
Aalborg University , Aalborg, Denmark
Torben Bach Pedersen
University of New South Wales , Sydney, New South Wales, Australia
Lijun Chang
The University of Queensland , Brisbane, Queensland, Australia
Wen Hua

A Analysis of TF

Lemma 1

In ROWS model, the size of two adjacent frames in a partition differs at most one.

Proof: Let $f_{i}$ and $f_{i+1}$ ($0<i<N$) be two adjacent frames in a partition. The size difference can be expressed as $sd_i=|(f_{i+1}.t-f_{i}.t) - (f_{i+1}.h-f_{i}.h)|$. Since frame head and frame tail moves at most one step ahead at one time, $0 \leqslant f_{i+1}.h-f_{i}.h \leqslant 1$ and $0 \leqslant f_{i+1}.t-f_{i}.t \leqslant 1$, then $0 \leqslant sd_i \leqslant 1$, Lemma 1 is proved.

$G_{j}$: The j’th group in a partition.
$k_{j}$: The number of frames which use the same TF in the group $G_{j}$.

The frames are divided into groups and frames in each group share the same TTV value, which is calculated when calculating the TV of the first frame in the group. Since two partitions share nothing, their cost could be analysed separately.

For purpose of simplicity to build the cost model, we set the whole table as a partition. Then the partition size is N. We first assume each frame has the same frame size F and frame head changes for each frame. Group $G_{j}$ contains the $k_{j}+1$ frames, $f_{i}$, $f_{i+1}$, $f_{i+2}$, ... , $f_{i+k_{j}}$. Let o denotes the overlap of the group of frames from $f_{i}$ to $f_{i+k_{j}}$ ($k_{j}>0$), then $o=f_{i}\cap f_{i+1}\cap ... \cap f_{i+k_{j}}$, it’s not hard to get that o starts from $f_{i+k_{j}}.h$ and ends at $f_{i}.t$. And apparently $f_{i}.t$ is bigger than $f_{i+k_{j}}.h$, because it is utterly useless if the overlap is empty. So $s = f_{i+k_{j}}.h$ and it will not be changed until it’s out of the range of current frame. Besides, e is initialized to $f_{i_{j}}.t$ and assigned to the tail position of current frame in the following calculation. So in the end, e will be equal to $f_{i+k_{j}}.t$.

We can get that, the start position s of TF is associated with $k_{j}$. So, if we want to know s, we must confirm $k_{j}$ firstly. Next, we will give the analysis of $k_{j}$.

For a group $G_{j}$ which contains $k_{j}$+1 frames, the calculation cost consists of two components, one is related to the calculation and updating of TTV, the other one is related to the calculation of each transition value (TV). Since the TF is finally identical with the last frame, the cost on TTV is $F\times T_{CPU}$. For the first frame, all rows need to be aggregated, for other frames, approximatively only rows before the overlap (the number of the rows for $f_{i+1}$ is k-1, the number of the rows for $f_{i+2}$ is k-2, ... , the number of the rows for $f_{i+k}$ is 0. So, the sum is k(k-1)/2) and the new entering row (For every frame, there is just one row entering the frame) needs to be aggregated. So the calculation cost related to TV is $(F+k\times (k-1)/2+k)\times T_{CPU}$. For each group $G_{j}$, the calculation cost of TF-based scheme is:

$$\begin{aligned} C_{cal}=(2F+k_{j}\times \frac{k_{j}+1}{2}) \times T_{CPU} \end{aligned}$$

(3)

So, for the partition, the total calculation cost is:

$$\begin{aligned} C_{cal}=\sum _{j=1}^{t}[(2F+k_{j}\times \frac{k_{j}+1}{2}) \times T_{CPU}] \end{aligned}$$

(4)

Since the number of frames in the partition is N, we get this:

$$\begin{aligned} \sum _{j=1}^{t}(k_{j}+1) = N \end{aligned}$$

(5)

By the Formulas 4 and 5, we get this:

$$\begin{aligned} f=\frac{1}{T_{CPU}} \times C_{cal}=2Ft+\frac{1}{2} \times \sum _{j=1}^{t}({k_{j}}^{2}+k_{j}) \end{aligned}$$

(6)

$$\begin{aligned} \varphi = t +\sum _{j=1}^{t}k_{j} - N = 0 \end{aligned}$$

(7)

f takes the minimum value when $C_{cal}$ takes the minimum value. In order to minimize the $C_{cal}$. We use the $Lagrange \ Multiplier$ with Formulas 6 and 7 ($\lambda $ is the new variable):

$$\begin{aligned} L(k_{1},k_{2},...,k_{t}) = f + \lambda \varphi \end{aligned}$$

(8)

As well as:

$$\begin{aligned} L(k_{1},k_{2},...,k_{t}) = 2Ft+\frac{1}{2} \times \sum _{j=1}^{t}({k_{j}}^{2}+k_{j}) + \lambda (t +\sum _{j=1}^{t}k_{j} - N) \end{aligned}$$

(9)

Then we get the simultaneous formulae with a step partial derivative of Formula 9 by $k_{1}, k_{2}, ..., k_{t}$ and Formula 7:

$$\begin{aligned} {\left\{ \begin{array}{ll} k_{1}+\frac{1}{2}+\lambda =0\\ k_{2}+\frac{1}{2}+\lambda =0\\ k_{3}+\frac{1}{2}+\lambda =0\\ .\\ .\\ k_{t}+\frac{1}{2}+\lambda =0\\ t +\sum _{j=1}^{t}k_{j} - N = 0 \end{array}\right. } \end{aligned}$$

(10)

By the Formula 10, we get this:

$$\begin{aligned} k_{1} = k_{2} = ... = k_{t} = k = \frac{N}{t} - 1 \end{aligned}$$

(11)

We can get that, if we give the partition and it is divided into t groups ($N, F, T_{CPU}$ and t have been given), $k_{1}, k_{2}, ... , k_{t}$ is all the same when the cost $C_{cal}$ is taking the minimum. Then we take the Formula 11 into Formula 6 and get this:

$$\begin{aligned} f = 2Ft + \frac{{N}^{2}}{2t} - \frac{N}{2} \end{aligned}$$

(12)

By the Formula 12, we can get that the cost($C_{cal}$) is associated with the number of groups in a partition. So, if we want to get a minimum cost, we should set a appropriate value of t. We get the first-order derivative of the Formula 12:

$$\begin{aligned} {f}^{'} = 2F - \frac{{N}^{2}}{2} \times {t}^{-2} \end{aligned}$$

(13)

Let ${f}^{'}=0$, we get that $t=\frac{N}{2\sqrt{F}}$. Then we take $t=\frac{N}{2\sqrt{F}}$ into Formula 11 and get that $k=2\sqrt{F}-1$.

So, if we have been given a partition and it’s size is P, we set the number of groups in the partition as $\frac{P}{2\sqrt{F}}$. That means that each group contains $2\sqrt{F}$ frames. For each group, the start position of TF is ${f}_{i+\lfloor 2\sqrt{F}\rfloor -1}.h$, $s={f}_{i}.h+\lfloor 2\sqrt{F}\rfloor -1$.

In the experiments, we tried different frame sizes. From the experiment results, we can see that the TF algorithm have a good performance.

Rights and permissions

Reprints and permissions

Copyright information

About this paper

Cite this paper

Song, G., Ma, J., Wang, X., Jin, C., Cao, Y. (2017). Optimizing Window Aggregate Functions in Relational Database Systems. In: Candan, S., Chen, L., Pedersen, T., Chang, L., Hua, W. (eds) Database Systems for Advanced Applications. DASFAA 2017. Lecture Notes in Computer Science(), vol 10177. Springer, Cham. https://doi.org/10.1007/978-3-319-55753-3_22

Download citation

DOI: https://doi.org/10.1007/978-3-319-55753-3_22
Published: 22 March 2017
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-55752-6
Online ISBN: 978-3-319-55753-3
eBook Packages: Computer ScienceComputer Science (R0)

Publish with us

Policies and ethics

Optimizing Window Aggregate Functions in Relational Database Systems

Abstract

Access this chapter

References

Acknowledgement

Author information

Authors and Affiliations

Corresponding author

Editor information

Editors and Affiliations

A Analysis of TF

A Analysis of TF

Lemma 1

Rights and permissions

Copyright information

About this paper

Cite this paper

Download citation

Share this paper

Publish with us

Search

Navigation