Abstract
We study the differential properties of higher-order statistical probabilistic programs with recursion and conditioning. Our starting point is an open problem posed by Hongseok Yang: what class of statistical probabilistic programs have densities that are differentiable almost everywhere? To formalise the problem, we consider Statistical PCF (SPCF), an extension of call-by-value PCF with real numbers, and constructs for sampling and conditioning. We give SPCF a sampling-style operational semantics à la Borgström et al., and study the associated weight (commonly referred to as the density) function and value function on the set of possible execution traces.
Our main result is that almost surely terminating SPCF programs, generated from a set of primitive functions (e.g. the set of analytic functions) satisfying mild closure properties, have weight and value functions that are almost everywhere differentiable. We use a stochastic form of symbolic execution to reason about almost everywhere differentiability. A by-product of this work is that almost surely terminating deterministic (S)PCF programs with real parameters denote functions that are almost everywhere differentiable.
Our result is of practical interest, as almost everywhere differentiability of the density function is required to hold for the correctness of major gradient-based inference algorithms.
Chapter PDF
Similar content being viewed by others
References
Gilles Barthe, Raphaëlle Crubillé, Ugo Dal Lago, and Francesco Gavazzo. On the versatility of open logical relations. In European Symposium on Programming, pages 56–83. Springer, 2020.
Sooraj Bhat, Ashish Agarwal, Richard W. Vuduc, and Alexander G. Gray. A type theory for probability density functions. In John Field and Michael Hicks, editors, Proceedings of the 39th ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages, POPL 2012, Philadelphia, Pennsylvania, USA, January 22-28, 2012, pages 545–556. ACM, 2012.
Sooraj Bhat, Johannes Borgström, Andrew D. Gordon, and Claudio V. Russo. Deriving probability density functions from probabilistic functional programs. Logical Methods in Computer Science, 13(2), 2017.
Benjamin Bichsel, Timon Gehr, and Martin T. Vechev. Fine-grained semantics for probabilistic programs. In Amal Ahmed, editor, Programming Languages and Systems - 27th European Symposium on Programming, ESOP 2018, Held as Part of the European Joint Conferences on Theory and Practice of Software, ETAPS 2018, Thessaloniki, Greece, April 14-20, 2018, Proceedings, volume 10801 of Lecture Notes in Computer Science, pages 145–185. Springer, 2018.
Eli Bingham, Jonathan P. Chen, Martin Jankowiak, Fritz Obermeyer, Neeraj Pradhan, Theofanis Karaletsos, Rohit Singh, Paul A. Szerlip, Paul Horsfall, and Noah D. Goodman. Pyro: Deep universal probabilistic programming. J. Mach. Learn. Res., 20:28:1–28:6, 2019.
David M Blei, Alp Kucukelbir, and Jon D McAuliffe. Variational inference: A review for statisticians. Journal of the American statistical Association, 112(518):859–877, 2017.
Jérôme Bolte and Edouard Pauwels. A mathematical model for automatic differentiation in machine learning. CoRR, abs/2006.02080, 2020.
Johannes Borgström, Ugo Dal Lago, Andrew D. Gordon, and Marcin Szymczak. A lambda-calculus foundation for universal probabilistic programming. In Proceedings of the 21st ACM SIGPLAN International Conference on Functional Programming, ICFP 2016, Nara, Japan, September 18-22, 2016, pages 33–46, 2016.
Aloïs Brunel, Damiano Mazza, and Michele Pagani. Backpropagation in the simply typed lambda-calculus with linear negation. Proc. ACM Program. Lang., 4(POPL):64:1–64:27, 2020.
Simon Castellan and Hugo Paquet. Probabilistic programming inference via intensional semantics. In European Symposium on Programming, pages 322–349. Springer, 2019.
Arun Chaganty, Aditya Nori, and Sriram Rajamani. Efficiently sampling probabilistic programs via program analysis. In Artificial Intelligence and Statistics, pages 153–160, 2013.
Lori A. Clarke. A system to generate test data and symbolically execute programs. IEEE Trans. Software Eng., 2(3):215–222, 1976.
Ryan Culpepper and Andrew Cobb. Contextual equivalence for probabilistic programs with continuous random variables and scoring. In Hongseok Yang, editor, Programming Languages and Systems - 26th European Symposium on Programming, ESOP 2017, Held as Part of the European Joint Conferences on Theory and Practice of Software, ETAPS 2017, Uppsala, Sweden, April 22-29, 2017, Proceedings, volume 10201 of Lecture Notes in Computer Science, pages 368–392. Springer, 2017.
Marco F. Cusumano-Towner, Feras A. Saad, Alexander K. Lew, and Vikash K. Mansinghka. Gen: a general-purpose probabilistic programming system with programmable inference. In Kathryn S. McKinley and Kathleen Fisher, editors, Proceedings of the 40th ACM SIGPLAN Conference on Programming Language Design and Implementation, PLDI 2019, Phoenix, AZ, USA, June 22-26, 2019, pages 221–236. ACM, 2019.
S. Duane, A. D. Kennedy, B. J. Pendleton, and D. Roweth. Hybrid monte carlo. Physics letters B, 1987.
Thomas Ehrhard, Michele Pagani, and Christine Tasson. Measurable cones and stable, measurable functions: a model for probabilistic higher-order programming. PACMPL, 2(POPL):59:1–59:28, 2018.
Thomas Ehrhard and Laurent Regnier. The differential lambda-calculus. Theor. Comput. Sci., 309(1-3):1–41, 2003.
Matthew D. Hoffman, David M. Blei, Chong Wang, and John W. Paisley. Stochastic variational inference. J. Mach. Learn. Res., 14(1):1303–1347, 2013.
Mathieu Huot, Sam Staton, and Matthijs Vákár. Correctness of automatic differentiation via diffeologies and categorical gluing. In Jean Goubault-Larrecq and Barbara König, editors, Foundations of Software Science and Computation Structures - 23rd International Conference, FOSSACS 2020, Held as Part of the European Joint Conferences on Theory and Practice of Software, ETAPS 2020, Dublin, Ireland, April 25-30, 2020, Proceedings, volume 12077 of Lecture Notesin Computer Science, pages 319–338. Springer, 2020.
Chung-Kil Hur, Aditya V Nori, Sriram K Rajamani, and Selva Samuel. A provably correct sampler for probabilistic programs. In 35th IARCS Annual Conference on Foundations of Software Technology and Theoretical Computer Science (FSTTCS 2015). Schloss Dagstuhl-Leibniz-Zentrum fuer Informatik, 2015.
Wazim Mohammed Ismail and Chung-chieh Shan. Deriving a probability density calculator (functional pearl). In Jacques Garrigue, Gabriele Keller, and Eijiro Sumii, editors, Proceedings of the 21st ACM SIGPLAN International Conference on Functional Programming, ICFP 2016, Nara, Japan, September 18-22, 2016, pages 47–59. ACM, 2016.
Benjamin Lucien Kaminski, Joost-Pieter Katoen, and Christoph Matheja. On the hardness of analyzing probabilistic programs. Acta Inf., 56(3):255–285, 2019.
James C. King. Symbolic execution and program testing. Commun. ACM, 19(7):385–394, 1976.
Diederik P. Kingma and Max Welling. Auto-encoding variational bayes. In Yoshua Bengio and Yann LeCun, editors, 2nd International Conference on Learning Representations, ICLR 2014, Banff, AB, Canada, April 14-16, 2014, Conference Track Proceedings, 2014.
Oleg Kiselyov. Problems of the Lightweight Implementation of Probabilistic Programming. In PPS Workshop, 2016.
Dexter Kozen. Semantics of probabilistic programs. In 20th Annual Symposium on Foundations of Computer Science, San Juan, Puerto Rico, 29-31 October 1979, pages 101–114, 1979.
Alp Kucukelbir, Rajesh Ranganath, Andrew Gelman, and David M. Blei. Automatic variational inference in stan. In Advances in Neural Information Processing Systems 28: Annual Conference on Neural Information Processing Systems 2015, December 7-12, 2015, Montreal, Quebec, Canada, pages 568–576, 2015.
Jeffrey M. Lee. Manifolds and Differential Geometry, volume 107 of Graduate Studies in Mathematics. AMS, 2009.
John M. Lee. An introduction to smooth manifolds, volume 218 of Graduate Texts in Mathematics. Springer, second edition, 2013.
Wonyeol Lee, Hangyeol Yu, Xavier Rival, and Hongseok Yang. On correctness of automatic differentiation for non-differentiable functions. CoRR, abs/2006.06903, 2020.
Wonyeol Lee, Hangyeol Yu, Xavier Rival, and Hongseok Yang. Towards verified stochastic variational inference for probabilistic programs. PACMPL, 4(POPL):16:1–16:33, 2020.
Wonyeol Lee, Hangyeol Yu, and Hongseok Yang. Reparameterization gradient for non-differentiable models. In Samy Bengio, Hanna M. Wallach, Hugo Larochelle, Kristen Grauman, Nicolò Cesa-Bianchi, and Roman Garnett, editors, Advances in Neural Information Processing Systems 31: Annual Conference on Neural Information Processing Systems 2018, NeurIPS 2018, 3-8 December 2018, Montréal, Canada, pages 5558–5568, 2018.
Alexander K Lew, Marco F Cusumano-Towner, Benjamin Sherman, Michael Carbin, and Vikash K Mansinghka. Trace types and denotational semantics for sound programmable inference in probabilistic languages. Proceedings of the ACM on Programming Languages, 4(POPL):1–32, 2019.
Carol Mak, C.-H. Luke Ong, Hugo Paquet, and Dominik Wagner. Densities of almost-surely terminating probabilistic programs are differentiable almost everywhere. CoRR, abs/2004.03924, 2020.
Damiano Mazza and Michele Pagani. Automatic differentiation in pcf. Proc. ACM Program. Lang., 5(POPL), January 2021.
Praveen Narayanan and Chung-chieh Shan. Symbolic disintegration with a variety of base measures. ACM Transactions on Programming Languages and Systems (TOPLAS), 42(2):1–60, 2020.
Radford M Neal. Mcmc using hamiltonian dynamics. Handbook of Markov Chain Monte Carlo, page 113, 2011.
Akihiko Nishimura, David B Dunson, and Jianfeng Lu. Discontinuous hamiltonian monte carlo for discrete parameters and discontinuous likelihoods. Biometrika, 107(2):365–380, Mar 2020.
Akihiko Nishimura, David B Dunson, and Jianfeng Lu. Discontinuous Hamiltonian Monte Carlo for discrete parameters and discontinuous likelihoods. Biometrika, 03 2020. asz083.
Rajesh Ranganath, Sean Gerrish, and David M. Blei. Black box variational inference. In Proceedings of the Seventeenth International Conference on Artificial Intelligence and Statistics, AISTATS 2014, Reykjavik, Iceland, April 22-25, 2014, pages 814–822, 2014.
Rajesh Ranganath, Sean Gerrish, and David M. Blei. Black box variational inference. In Proceedings of the Seventeenth International Conference on Artificial Intelligence and Statistics, AISTATS 2014, Reykjavik, Iceland, April 22-25, 2014, volume 33 of JMLR Workshop and Conference Proceedings, pages 814–822. JMLR.org, 2014.
Danilo Jimenez Rezende, Shakir Mohamed, and Daan Wierstra. Stochastic backpropagation and approximate inference in deep generative models. In Proceedings of the 31th International Conference on Machine Learning, ICML 2014, Beijing, China, 21-26 June 2014, volume 32 of JMLR Workshop and Conference Proceedings, pages 1278–1286. JMLR.org, 2014.
Walter Rudin. Principles of Mathematical Analysis. International Series in Pure and Applied Mathematics. McGraw-Hill Education, 3rd edition edition, 1976.
Nasser Saheb-Djahromi. Probabilistic lcf. In International Symposium on Mathematical Foundations of Computer Science, pages 442–451. Springer, 1978.
Adam Ścibior, Ohad Kammar, Matthijs Vákár, Sam Staton, Hongseok Yang, Yufei Cai, Klaus Ostermann, Sean K Moss, Chris Heunen, and Zoubin Ghahramani. Denotational validation of higher-order bayesian inference. Proceedings of the ACM on Programming Languages, 2(POPL):60, 2017.
Dana S. Scott. A type-theoretical alternative to ISWIM, CUCH, OWHY. Theor. Comput. Sci., 121(1&2):411–440, 1993.
Kurt Sieber. Relating full abstraction results for different programming languages. In Foundations of Software Technology and Theoretical Computer Science, Tenth Conference, Bangalore, India, December 17-19, 1990, Proceedings, pages 373–387, 1990.
Sam Staton. Commutative semantics for probabilistic programming. In Hongseok Yang, editor, Programming Languages and Systems - 26th European Symposium on Programming, ESOP 2017, Held as Part of the European Joint Conferences on Theory and Practice of Software, ETAPS 2017,Uppsala, Sweden, April 22-29, 2017, Proceedings, volume 10201 of Lecture Notes in Computer Science, pages 855–879. Springer, 2017.
Michalis K. Titsias and Miguel Lázaro-Gredilla. Doubly stochastic variational bayes for non-conjugate inference. In Proceedings of the 31th International Conference on Machine Learning, ICML 2014, Beijing, China, 21-26 June 2014, volume 32 of JMLR Workshop and Conference Proceedings, pages 1971–1979. JMLR.org, 2014.
Loring W. Tu. An introduction to manifolds. Universitext. Springer-Verlag, 2011.
Matthijs Vákár, Ohad Kammar, and Sam Staton. A domain theory for statistical probabilistic programming. PACMPL, 3(POPL):36:1–36:29, 2019.
Mitchell Wand, Ryan Culpepper, Theophilos Giannakopoulos, and Andrew Cobb. Contextual equivalence for a probabilistic language with continuous random variables and recursion. PACMPL, 2(ICFP):87:1–87:30, 2018.
David Wingate, Andreas Stuhlmüller, and Noah D. Goodman. Lightweight implementations of probabilistic programming languages via transformational compilation. In Geoffrey J. Gordon, David B. Dunson, and Miroslav Dudík, editors, Proceedings of the Fourteenth International Conference on Artificial Intelligence and Statistics, AISTATS 2011, Fort Lauderdale, USA, April 11-13, 2011, volume 15 of JMLR Proceedings, pages 770–778. JMLR.org, 2011.
Hongseok Yang. Some semantic issues in probabilistic programming languages (invited talk). In Herman Geuvers, editor, 4th International Conference on Formal Structures for Computation and Deduction, FSCD 2019, June 24-30, 2019, Dortmund, Germany, volume 131 of LIPIcs, pages 4:1–4:6. Schloss Dagstuhl - Leibniz-Zentrum für Informatik, 2019.
Yuan Zhou, Bradley J. Gram-Hansen, Tobias Kohn, Tom Rainforth, Hongseok Yang, and Frank Wood. LF-PPL: A low-level first order probabilistic programming language for non-differentiable models. In Kamalika Chaudhuri and Masashi Sugiyama, editors, The 22nd International Conference on Artificial Intelligence and Statistics, AISTATS 2019, 16-18 April 2019, Naha, Okinawa, Japan, volume 89 of Proceedings of Machine Learning Research, pages 148–157. PMLR, 2019.
Yuan Zhou, Hongseok Yang, Yee Whye Teh, and Tom Rainforth. Divide, conquer, and combine: a new inference strategy for probabilistic programs with stochastic support. CoRR, abs/1910.13324, 2019.
Author information
Authors and Affiliations
Corresponding author
Editor information
Editors and Affiliations
Rights and permissions
Open Access This chapter is licensed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license and indicate if changes were made.
The images or other third party material in this chapter are included in the chapter's Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the chapter's Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.
Copyright information
© 2021 The Author(s)
About this paper
Cite this paper
Mak, C., Ong, CH.L., Paquet, H., Wagner, D. (2021). Densities of Almost Surely Terminating Probabilistic Programs are Differentiable Almost Everywhere. In: Yoshida, N. (eds) Programming Languages and Systems. ESOP 2021. Lecture Notes in Computer Science(), vol 12648. Springer, Cham. https://doi.org/10.1007/978-3-030-72019-3_16
Download citation
DOI: https://doi.org/10.1007/978-3-030-72019-3_16
Published:
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-72018-6
Online ISBN: 978-3-030-72019-3
eBook Packages: Computer ScienceComputer Science (R0)