Re-Engineering Compiler Transformations to Outperform Database Query Optimizers

  • Conference paper
Languages and Compilers for Parallel Computing (LCPC 2014)

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 8967)

Abstract

Traditionally, query optimization and compiler optimization have been developed independently. Whereas query optimization aims to minimize the number of disk operations a database query performs, compiler optimization aims to maximize the performance of generic executable code. Query optimizers were originally designed for systems that had to process large volumes of data with little main memory, but main memory sizes have since increased significantly. As a result, the database community is now considering techniques that were developed in the field of compiler optimization. In this paper, we demonstrate that the converse is much more lucrative: extending compiler transformations so that they also target query optimization. Advanced compiler optimizations then become the driving force in query optimization, and database systems can keep pace with future complex computer architectures.

Notes

  1. Index sets should not be confused with database indexes; they serve a different purpose. An index set is used to specify iteration in the forelem framework. Many index sets are only used in the intermediate representation and are not materialized.

  2. A description of how to translate SQL queries to forelem will be part of a forthcoming publication.

  3. Note that the update operation \(\mathbb{R} = \mathbb{R} \cup \ldots\) could be interpreted as an output dependency. However, because \(\mathbb{R}\) is a multiset, the updates impose no strict order on the execution, and we therefore do not consider this a dependency.
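The reasoning in note 3 can be illustrated with a minimal sketch. It is not taken from the paper: it assumes Python's `collections.Counter` as a stand-in for the multiset \(\mathbb{R}\), and the helper `accumulate` and the sample tuples are hypothetical. It shows only that repeated multiset union yields the same result whatever order the iterations run in.

```python
from collections import Counter


def accumulate(tuples, order):
    """Apply R = R ∪ {t} once per index in `order` (hypothetical helper).

    A Counter stands in for the multiset R: it records how many times
    each tuple was added and keeps no insertion order for comparison.
    """
    result = Counter()
    for i in order:
        # Multiset union with the single-element multiset {tuples[i]}.
        result.update([tuples[i]])
    return result


# Illustrative data only; note the duplicate tuple.
rows = [("a", 1), ("b", 2), ("a", 1), ("c", 3)]

forward = accumulate(rows, range(len(rows)))
backward = accumulate(rows, reversed(range(len(rows))))

# Both iteration orders produce the same multiset.
print(forward == backward)
```

Because the two results compare equal, reordering the iterations is safe with respect to this update, which is the sense in which the note does not treat it as a dependency.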

Author information

Corresponding author

Correspondence to Kristian F. D. Rietveld.

Copyright information

© 2015 Springer International Publishing Switzerland

About this paper

Cite this paper

Rietveld, K.F.D., Wijshoff, H.A.G. (2015). Re-Engineering Compiler Transformations to Outperform Database Query Optimizers. In: Brodman, J., Tu, P. (eds) Languages and Compilers for Parallel Computing. LCPC 2014. Lecture Notes in Computer Science, vol 8967. Springer, Cham. https://doi.org/10.1007/978-3-319-17473-0_20

  • DOI: https://doi.org/10.1007/978-3-319-17473-0_20

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-17472-3

  • Online ISBN: 978-3-319-17473-0

  • eBook Packages: Computer Science, Computer Science (R0)
