Correctness of Automatic Differentiation via Diffeologies and Categorical Gluing

Part of the Lecture Notes in Computer Science book series (LNTCS, volume 12077)

Abstract

We present semantic correctness proofs of Automatic Differentiation (AD). We consider a forward-mode AD method on a higher-order language with algebraic data types, and we characterise it as the unique structure-preserving macro given a choice of derivatives for basic operations. We describe a rich semantics for differentiable programming, based on diffeological spaces. We show that it interprets our language, and we phrase what it means for the AD method to be correct with respect to this semantics. We show that our characterisation of AD gives rise to an elegant semantic proof of its correctness based on a gluing construction on diffeological spaces. We explain how this is, in essence, a logical relations argument. Finally, we sketch how the analysis extends to other AD methods by considering a continuation-based method.
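
As a rough illustration of the forward-mode idea summarised above, the following Python sketch pairs every real value with a tangent and fixes a derivative for each basic operation; derivatives of composite, higher-order programs then follow from the structure of the program alone. This is not the paper's macro, which is defined on a typed language: the names `Dual`, `add`, `mul`, `sin`, `derivative`, `twice` and `program` are hypothetical and chosen only for this example.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Dual:
    """A real number paired with a tangent (directional derivative)."""
    primal: float
    tangent: float


# Chosen derivatives for the basic operations (an assumption of this sketch).
def add(x: Dual, y: Dual) -> Dual:
    return Dual(x.primal + y.primal, x.tangent + y.tangent)


def mul(x: Dual, y: Dual) -> Dual:
    # Product rule.
    return Dual(x.primal * y.primal,
                x.primal * y.tangent + x.tangent * y.primal)


def sin(x: Dual) -> Dual:
    # Chain rule with d/dx sin(x) = cos(x).
    return Dual(math.sin(x.primal), math.cos(x.primal) * x.tangent)


def derivative(f, x: float) -> float:
    """Differentiate f at x by seeding the tangent with 1."""
    return f(Dual(x, 1.0)).tangent


# A higher-order program: functions are passed around unchanged,
# only the basic operations know about tangents.
def twice(g):
    return lambda x: g(g(x))


def program(x: Dual) -> Dual:
    h = lambda t: add(mul(t, t), sin(t))  # h(t) = t^2 + sin(t)
    return twice(h)(x)                     # h(h(x))
```

Calling `derivative(program, 0.5)` should agree with the chain-rule value h'(h(0.5)) · h'(0.5), where h'(t) = 2t + cos(t). That agreement is an instance of the kind of correctness statement the paper proves in general.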

M. Huot, S. Staton and M. Vákár—Equal contribution.

Corresponding author

Correspondence to Mathieu Huot.

Rights and permissions

Open Access This chapter is licensed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license and indicate if changes were made.

The images or other third party material in this chapter are included in the chapter's Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the chapter's Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.

Copyright information

© 2020 The Author(s)

About this paper

Cite this paper

Huot, M., Staton, S., Vákár, M. (2020). Correctness of Automatic Differentiation via Diffeologies and Categorical Gluing. In: Goubault-Larrecq, J., König, B. (eds) Foundations of Software Science and Computation Structures. FoSSaCS 2020. Lecture Notes in Computer Science, vol 12077. Springer, Cham. https://doi.org/10.1007/978-3-030-45231-5_17
