Abstract
The window function has become an important OLAP extension of SQL since SQL:2003, and is supported by major commercial RDBMSs (e.g. Oracle, DB2, SQL Server, Teradata and Pivotal Greenplum) and by emerging Big Data platforms (e.g. Google Tenzing, Apache Hive, Pivotal HAWQ and Cloudera Impala). Window functions are designed for advanced data analytics use cases, bringing significant functional and performance enhancements to OLAP and decision support applications. However, we identify that existing window function evaluation approaches are still with significant room for improvement. In this paper, we revisit the conventional two-phase evaluation framework for window functions in relational databases, and propose several novel optimization techniques which aim to minimize the redundant data accesses and computations during the function calls invoked over window frames. We have integrated the proposed techniques into PostgreSQL, and compared them with both PostgreSQL’s and SQL Server’s native window function implementation over the TPC benchmark. Our comprehensive experimental studies demonstrate significant speedup over existing approaches.
Access this chapter
Tax calculation will be finalised at checkout
Purchases are for personal use only
References
Zuzarte, C., Pirahesh, H., Ma, W., Cheng, Q., Liu, L., Wong, K.: Winmagic: subquery elimination using window aggregation. In: Proceedings of the ACM SIGMOD International Conference on Management of Data, SIGMOD 2003, pp. 652–656. ACM, New York (2003)
Bellamkonda, S., Ahmed, R., Witkowski, A., Amor, A., Zait, M., Lin, C.-C.: Enhanced subquery optimizations in oracle. Proc. VLDB Endow. 2(2), 1366–1377 (2009)
Ben-Gan, I.: Microsoft SQL Server High-Performance T-SQL Using Window Functions. Microsoft Press (2012)
Cao, Y., Bramandia, R., Chan, C.Y., Tan, K.-L.: Optimized query evaluation using cooperative sorts. In: Proceedings of the 26th International Conference on Data Engineering, ICDE, Long Beach, California, USA, pp. 601–612. IEEE (2010)
Cao, Y., Chan, C.-Y., Li, J., Tan, K.-L.: Optimization of analytic window functions. Proc. VLDB Endow. 5(11), 1244–1255 (2012)
Cao, Y., Bramandia, R., Chan, C.-Y., Tan, K.-L.: Sort-sharing-aware query processing. VLDB J. 21(3), 411–436 (2012)
Window functions for postgresql desgin overview (2008). http://www.umitanuki.net/pgsql/wfv08/design.html
Bellamkonda, S., Bozkaya, T., Ghosh, B., Gupta, A., Haydu, J., Subramanian, S., Witkowski, A.: Analytic functions in oracle 8i. Technical report (2000)
Jin, C.Q., Yi, K., Chen, L., Yu, J.X., Lin, X.M.: Sliding-window top-k queries on uncertain streams. VLDB J. 19(3), 411–435 (2010)
Li, J., Maier, D., Tufte, K., Papadimos, V., Tucker, P.A.: Semantics and evaluation techniques for window aggregates in data streams. In: Proceedings of the ACM SIGMOD International Conference on Management of Data, SIGMOD 2005, pp. 311–322. ACM, New York (2005)
Li, J., Maier, D., Tufte, K., Papadimos, V., Tucker, P.A.: No pain, no gain:efficient evaluation og sliding-window aggregates over data streams. SIGMOD Rec. 34(1), 39–44 (2005)
Neumann, T., Moerkotte, G.: A combined framework for grouping, order optimization. In: Proceedings of the 30st International Conference on Very Large Data Bases, VLDB 2004, pp. 960–971. VLDB Endowment (2004)
Simmen, D., Shekita, E., Malkemus, T.: Fundamental techniques for order optimization. In: Proceedings of the 1996 ACM SIGMOD International Conference on Management of Data, SIGMOD 1996, pp. 57–67. ACM, NY (1996)
Wang, X.Y., Chernicak, M.: Avoiding sorting, grouping in processing queries. In: Proceedings of the 29th International Conference on Very Large Data Bases, VLDB 2003, pp. 826–837. VLDB Endowment (2003)
Leis, V., Kan, K., Kemper, A., et al.: Efficient processing of window functions in analytical SQL queries. Proc. VLDB Endow. 8(10), 1058–1069 (2015)
Wesley, R., Xu, F.: Incremental computation of common windowed holistic aggregates. Proc. VLDB Endow. 9(12), 1221–1232 (2016)
Acknowledgement
This work was supported by NSFC grants (No. 61472141,U1501252 and 61021004), Shanghai Knowledge Service Platform Project (No. ZF1213), Shanghai Leading Academic Discipline Project (Project NumberB412) and Shanghai Agriculture Science Program(2015) Number 3-2.
Author information
Authors and Affiliations
Corresponding author
Editor information
Editors and Affiliations
A Analysis of TF
A Analysis of TF
Lemma 1
In ROWS model, the size of two adjacent frames in a partition differs at most one.
Proof: Let \(f_{i}\) and \(f_{i+1}\) (\(0<i<N\)) be two adjacent frames in a partition. The size difference can be expressed as \(sd_i=|(f_{i+1}.t-f_{i}.t) - (f_{i+1}.h-f_{i}.h)|\). Since frame head and frame tail moves at most one step ahead at one time, \(0 \leqslant f_{i+1}.h-f_{i}.h \leqslant 1\) and \(0 \leqslant f_{i+1}.t-f_{i}.t \leqslant 1\), then \(0 \leqslant sd_i \leqslant 1\), Lemma 1 is proved.
-
\(G_{j}\): The j’th group in a partition.
-
\(k_{j}\): The number of frames which use the same TF in the group \(G_{j}\).
The frames are divided into groups and frames in each group share the same TTV value, which is calculated when calculating the TV of the first frame in the group. Since two partitions share nothing, their cost could be analysed separately.
For purpose of simplicity to build the cost model, we set the whole table as a partition. Then the partition size is N. We first assume each frame has the same frame size F and frame head changes for each frame. Group \(G_{j}\) contains the \(k_{j}+1\) frames, \(f_{i}\), \(f_{i+1}\), \(f_{i+2}\), ... , \(f_{i+k_{j}}\). Let o denotes the overlap of the group of frames from \(f_{i}\) to \(f_{i+k_{j}}\) (\(k_{j}>0\)), then \(o=f_{i}\cap f_{i+1}\cap ... \cap f_{i+k_{j}}\), it’s not hard to get that o starts from \(f_{i+k_{j}}.h\) and ends at \(f_{i}.t\). And apparently \(f_{i}.t\) is bigger than \(f_{i+k_{j}}.h\), because it is utterly useless if the overlap is empty. So \(s = f_{i+k_{j}}.h\) and it will not be changed until it’s out of the range of current frame. Besides, e is initialized to \(f_{i_{j}}.t\) and assigned to the tail position of current frame in the following calculation. So in the end, e will be equal to \(f_{i+k_{j}}.t\).
We can get that, the start position s of TF is associated with \(k_{j}\). So, if we want to know s, we must confirm \(k_{j}\) firstly. Next, we will give the analysis of \(k_{j}\).
For a group \(G_{j}\) which contains \(k_{j}\)+1 frames, the calculation cost consists of two components, one is related to the calculation and updating of TTV, the other one is related to the calculation of each transition value (TV). Since the TF is finally identical with the last frame, the cost on TTV is \(F\times T_{CPU}\). For the first frame, all rows need to be aggregated, for other frames, approximatively only rows before the overlap (the number of the rows for \(f_{i+1}\) is k-1, the number of the rows for \(f_{i+2}\) is k-2, ... , the number of the rows for \(f_{i+k}\) is 0. So, the sum is k(k-1)/2) and the new entering row (For every frame, there is just one row entering the frame) needs to be aggregated. So the calculation cost related to TV is \((F+k\times (k-1)/2+k)\times T_{CPU}\). For each group \(G_{j}\), the calculation cost of TF-based scheme is:
So, for the partition, the total calculation cost is:
Since the number of frames in the partition is N, we get this:
By the Formulas 4 and 5, we get this:
f takes the minimum value when \(C_{cal}\) takes the minimum value. In order to minimize the \(C_{cal}\). We use the \(Lagrange \ Multiplier\) with Formulas 6 and 7 (\(\lambda \) is the new variable):
As well as:
Then we get the simultaneous formulae with a step partial derivative of Formula 9 by \(k_{1}, k_{2}, ..., k_{t}\) and Formula 7:
By the Formula 10, we get this:
We can get that, if we give the partition and it is divided into t groups (\(N, F, T_{CPU}\) and t have been given), \(k_{1}, k_{2}, ... , k_{t}\) is all the same when the cost \(C_{cal}\) is taking the minimum. Then we take the Formula 11 into Formula 6 and get this:
By the Formula 12, we can get that the cost(\(C_{cal}\)) is associated with the number of groups in a partition. So, if we want to get a minimum cost, we should set a appropriate value of t. We get the first-order derivative of the Formula 12:
Let \({f}^{'}=0\), we get that \(t=\frac{N}{2\sqrt{F}}\). Then we take \(t=\frac{N}{2\sqrt{F}}\) into Formula 11 and get that \(k=2\sqrt{F}-1\).
So, if we have been given a partition and it’s size is P, we set the number of groups in the partition as \(\frac{P}{2\sqrt{F}}\). That means that each group contains \(2\sqrt{F}\) frames. For each group, the start position of TF is \({f}_{i+\lfloor 2\sqrt{F}\rfloor -1}.h\), \(s={f}_{i}.h+\lfloor 2\sqrt{F}\rfloor -1\).
In the experiments, we tried different frame sizes. From the experiment results, we can see that the TF algorithm have a good performance.
Rights and permissions
Copyright information
© 2017 Springer International Publishing AG
About this paper
Cite this paper
Song, G., Ma, J., Wang, X., Jin, C., Cao, Y. (2017). Optimizing Window Aggregate Functions in Relational Database Systems. In: Candan, S., Chen, L., Pedersen, T., Chang, L., Hua, W. (eds) Database Systems for Advanced Applications. DASFAA 2017. Lecture Notes in Computer Science(), vol 10177. Springer, Cham. https://doi.org/10.1007/978-3-319-55753-3_22
Download citation
DOI: https://doi.org/10.1007/978-3-319-55753-3_22
Published:
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-55752-6
Online ISBN: 978-3-319-55753-3
eBook Packages: Computer ScienceComputer Science (R0)