Re-Engineering Compiler Transformations to Outperform Database Query Optimizers

Rietveld, Kristian F. D.; Wijshoff, Harry A. G.

doi:10.1007/978-3-319-17473-0_20

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 8967)

Included in the following conference series:

International Workshop on Languages and Compilers for Parallel Computing

Abstract

Traditionally, query optimization and compiler optimization have been developed independently. Whereas query optimization aims to minimize the number of disk operations needed to evaluate database queries, compiler optimization aims to maximize the performance of generic executable code. Query optimizers were originally designed for systems that had to process large volumes of data with little main memory, but main memory sizes have since increased significantly. As a result, the database community is now considering techniques that were developed in the field of compiler optimization. In this paper, we demonstrate that the converse is much more lucrative: extending compiler transformations to also target query optimization. By doing so, advanced compiler optimizations become the driving force in query optimization, and database systems can keep pace with future complex computer architectures.

Notes

1. Index sets should not be confused with database indexes and serve a different purpose. An index set is used to specify iteration in the forelem framework. Many index sets are only used in the intermediate representation and are not materialized.
2. A description of how to translate SQL queries to forelem will be part of a forthcoming publication.
3. Note that the update operation \(\mathbb{R} = \mathbb{R} \cup \ldots\) could be interpreted as an output dependency. However, the fact that \(\mathbb{R}\) is a multiset does not impose a strict order on the execution, therefore we do not consider this as a dependency.
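
Notes 1 and 3 can be illustrated with a small Python sketch. It is not forelem syntax and not the authors' implementation. The table `EMPLOYEES`, the helper `index_set`, and the function `select_names` are hypothetical names introduced only for this example, and modelling an index set as a lazy generator of row subscripts is an assumption. The sketch shows two things: the subscripts drive the iteration without being stored as a separate structure, and a result built by repeated multiset union comes out the same regardless of the order in which the iterations run.

```python
from collections import Counter
import random

# Hypothetical table: a list of (name, dept) tuples, subscripted by position.
EMPLOYEES = [
    ("ann", "db"),
    ("bob", "compilers"),
    ("cat", "db"),
    ("dan", "db"),
]


def index_set(table, field, value):
    """Lazily yield the subscripts of rows whose `field` equals `value`.

    Assumption: this generator stands in for an index set. It only
    specifies which rows to visit and is never stored as a whole.
    """
    for i, row in enumerate(table):
        if row[field] == value:
            yield i


def select_names(table, subscripts):
    """Build a result multiset by repeated R = R union {tuple}."""
    result = Counter()  # Counter models a multiset: element -> multiplicity
    for i in subscripts:
        result[table[i][0]] += 1  # multiset union with a single element
    return result


# Visit the selected rows in table order.
in_order = select_names(EMPLOYEES, index_set(EMPLOYEES, 1, "db"))

# Visit the same rows in an arbitrary order.
shuffled = list(index_set(EMPLOYEES, 1, "db"))
random.shuffle(shuffled)
any_order = select_names(EMPLOYEES, shuffled)

# Multiset union is commutative, so both runs give the same result.
print(in_order == any_order)
```

Because the result is compared as a multiset rather than as an ordered list, no iteration has to wait for another one to finish, which is the reason Note 3 gives for not treating the update as a dependency.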

Author information

Authors and Affiliations

LIACS, Leiden University, Leiden, The Netherlands
Kristian F. D. Rietveld & Harry A. G. Wijshoff

Corresponding author

Correspondence to Kristian F. D. Rietveld.

Editor information

Editors and Affiliations

Intel Corporation, Santa Clara, California, USA
James Brodman
Intel Corporation, Santa Clara, California, USA
Peng Tu

About this paper

Cite this paper

Rietveld, K.F.D., Wijshoff, H.A.G. (2015). Re-Engineering Compiler Transformations to Outperform Database Query Optimizers. In: Brodman, J., Tu, P. (eds) Languages and Compilers for Parallel Computing. LCPC 2014. Lecture Notes in Computer Science, vol 8967. Springer, Cham. https://doi.org/10.1007/978-3-319-17473-0_20

DOI: https://doi.org/10.1007/978-3-319-17473-0_20
Published: 01 May 2015
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-17472-3
Online ISBN: 978-3-319-17473-0
eBook Packages: Computer Science, Computer Science (R0)
