A Coq Formalisation of SQL’s Execution Engines

  • Conference paper
Interactive Theorem Proving (ITP 2018)

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 10895)

Abstract

In this article, we use the Coq proof assistant to specify and verify the low-level layer of SQL’s execution engines. To this end, we first design a high-level Coq specification of data-centric operators that is intended to capture their essence. We then provide two Coq implementations of this specification. The first, the physical algebra, consists of the low-level operators found in systems such as PostgreSQL or Oracle. The second, SQL algebra, is an extended relational algebra that provides a semantics for SQL. Last, we formally relate the physical algebra and SQL algebra. By proving that the physical algebra implements SQL algebra, we give high-level assurances that physical algebra and SQL algebra expressions enjoy the same semantics. To the best of our knowledge, this yields the first formalisation and verification of the low-level layer of an RDBMS, as well as of the physical optimisation step of SQL’s compilation: fundamental steps towards mechanising SQL’s compilation chain.
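As an informal illustration of what "the physical algebra implements SQL algebra" can mean, the following Python sketch compares an iterator-style nested-loop join with a declarative bag-semantics join. It is not the Coq development described above: the names `spec_join`, `seq_scan`, `nested_loop` and `implements` are hypothetical, tuples are modelled as plain Python tuples, and the check is a runtime comparison on one input rather than a proof.

```python
from collections import Counter
from itertools import product


def spec_join(r, s, pred):
    """Declarative bag semantics (assumed for illustration): each joined
    tuple occurs as many times as the product of its parts' occurrences."""
    out = Counter()
    for (a, m), (b, n) in product(Counter(r).items(), Counter(s).items()):
        if pred(a, b):
            out[a + b] += m * n
    return out


def seq_scan(rel):
    """Physical operator sketch: yield the stored tuples one at a time."""
    for t in rel:
        yield t


def nested_loop(outer, make_inner, pred):
    """Physical operator sketch: for each outer tuple, restart the inner
    iterator and emit every matching concatenation."""
    for a in outer:
        for b in make_inner():
            if pred(a, b):
                yield a + b


def implements(r, s, pred):
    """Check, on one input, that the physical plan and the declarative
    join produce the same bag of tuples."""
    physical = Counter(nested_loop(seq_scan(r), lambda: seq_scan(s), pred))
    return physical == spec_join(r, s, pred)


if __name__ == "__main__":
    r = [(1, "a"), (1, "a"), (2, "b")]
    s = [(1, "x"), (3, "y")]
    print(implements(r, s, lambda a, b: a[0] == b[0]))
```

Comparing `Counter` values rather than lists makes the check insensitive to the order in which the physical plan emits tuples while still accounting for duplicates, which is the usual reading of SQL's bag semantics.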

Work funded by the DataCert ANR project: ANR-15-CE39-0009.

Notes

  1. The model exploits system-collected statistics about the data stored in the database.

  2. The IJ nodes are expressed in PostgreSQL as a Nested loop combined with an Index scan, but correspond to an index-based join.

  3. will be according to the various types of elements and various implementations for the collection. A particular case of is which denotes the number of occurrences in a list.

  4. We could also use a module type, but the syntax would be heavier and less general.

  5. This construction is similar to the exception monad. There is no point in writing the standard “return” and “bind” operators: the sequential scan and the nested loop, respectively, can be seen as online versions of them.

Corresponding author

Correspondence to É. Contejean.

Copyright information

© 2018 Springer International Publishing AG, part of Springer Nature

Cite this paper

Benzaken, V., Contejean, É., Keller, C., Martins, E. (2018). A Coq Formalisation of SQL’s Execution Engines. In: Avigad, J., Mahboubi, A. (eds) Interactive Theorem Proving. ITP 2018. Lecture Notes in Computer Science, vol 10895. Springer, Cham. https://doi.org/10.1007/978-3-319-94821-8_6

  • DOI: https://doi.org/10.1007/978-3-319-94821-8_6

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-94820-1

  • Online ISBN: 978-3-319-94821-8

  • eBook Packages: Computer Science (R0)
