Re-Engineering Compiler Transformations to Outperform Database Query Optimizers

  • Kristian F. D. Rietveld
  • Harry A. G. Wijshoff
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 8967)

Abstract

Traditionally, query optimization and compiler optimization have been developed independently. Whereas query optimization aims to minimize the number of disk operations performed by database queries, compiler optimization aims to maximize the performance of generic executable code. Query optimizers were originally designed for systems that had to process large volumes of data with little main memory, but the size of computer main memory has since increased significantly. As a result, the database community is now considering techniques that were developed in the field of compiler optimization. In this paper, we demonstrate that the converse is much more lucrative: extending compiler transformations to also target query optimization. By doing so, advanced compiler optimizations become the driving force in query optimization, and database systems can be on par with future complex computer architectures.

Keywords

Loop Nest, Execution Plan, Query Optimization, Compiler Optimization, Abstract Syntax Tree
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.

Copyright information

© Springer International Publishing Switzerland 2015

Authors and Affiliations

  • Kristian F. D. Rietveld (1)
  • Harry A. G. Wijshoff (1)
  1. LIACS, Leiden University, Leiden, The Netherlands
