Densities of Almost Surely Terminating Probabilistic Programs are Differentiable Almost Everywhere

Mak, Carol; Ong, C.-H. Luke; Paquet, Hugo; Wagner, Dominik

doi:10.1007/978-3-030-72019-3_16

Part of the book series: Lecture Notes in Computer Science ((LNTCS,volume 12648))

Included in the following conference series:

European Symposium on Programming

4027 Accesses
10 Citations

Abstract

We study the differential properties of higher-order statistical probabilistic programs with recursion and conditioning. Our starting point is an open problem posed by Hongseok Yang: what class of statistical probabilistic programs have densities that are differentiable almost everywhere? To formalise the problem, we consider Statistical PCF (SPCF), an extension of call-by-value PCF with real numbers, and constructs for sampling and conditioning. We give SPCF a sampling-style operational semantics à la Borgström et al., and study the associated weight (commonly referred to as the density) function and value function on the set of possible execution traces.

Our main result is that almost surely terminating SPCF programs, generated from a set of primitive functions (e.g. the set of analytic functions) satisfying mild closure properties, have weight and value functions that are almost everywhere differentiable. We use a stochastic form of symbolic execution to reason about almost everywhere differentiability. A by-product of this work is that almost surely terminating deterministic (S)PCF programs with real parameters denote functions that are almost everywhere differentiable.

Our result is of practical interest, as almost everywhere differentiability of the density function is required to hold for the correctness of major gradient-based inference algorithms.

Download to read the full chapter text

Chapter PDF

Does a Program Yield the Right Distribution?

Probabilistic Analysis of Programs: A Weak Limit Approach

Weakest Precondition Reasoning for Expected Run–Times of Probabilistic Programs

References

Gilles Barthe, Raphaëlle Crubillé, Ugo Dal Lago, and Francesco Gavazzo. On the versatility of open logical relations. In European Symposium on Programming, pages 56–83. Springer, 2020.
Google Scholar
Sooraj Bhat, Ashish Agarwal, Richard W. Vuduc, and Alexander G. Gray. A type theory for probability density functions. In John Field and Michael Hicks, editors, Proceedings of the 39th ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages, POPL 2012, Philadelphia, Pennsylvania, USA, January 22-28, 2012, pages 545–556. ACM, 2012.
Google Scholar
Sooraj Bhat, Johannes Borgström, Andrew D. Gordon, and Claudio V. Russo. Deriving probability density functions from probabilistic functional programs. Logical Methods in Computer Science, 13(2), 2017.
Google Scholar
Benjamin Bichsel, Timon Gehr, and Martin T. Vechev. Fine-grained semantics for probabilistic programs. In Amal Ahmed, editor, Programming Languages and Systems - 27th European Symposium on Programming, ESOP 2018, Held as Part of the European Joint Conferences on Theory and Practice of Software, ETAPS 2018, Thessaloniki, Greece, April 14-20, 2018, Proceedings, volume 10801 of Lecture Notes in Computer Science, pages 145–185. Springer, 2018.
Google Scholar
Eli Bingham, Jonathan P. Chen, Martin Jankowiak, Fritz Obermeyer, Neeraj Pradhan, Theofanis Karaletsos, Rohit Singh, Paul A. Szerlip, Paul Horsfall, and Noah D. Goodman. Pyro: Deep universal probabilistic programming. J. Mach. Learn. Res., 20:28:1–28:6, 2019.
Google Scholar
David M Blei, Alp Kucukelbir, and Jon D McAuliffe. Variational inference: A review for statisticians. Journal of the American statistical Association, 112(518):859–877, 2017.
Google Scholar
Jérôme Bolte and Edouard Pauwels. A mathematical model for automatic differentiation in machine learning. CoRR, abs/2006.02080, 2020.
Google Scholar
Johannes Borgström, Ugo Dal Lago, Andrew D. Gordon, and Marcin Szymczak. A lambda-calculus foundation for universal probabilistic programming. In Proceedings of the 21st ACM SIGPLAN International Conference on Functional Programming, ICFP 2016, Nara, Japan, September 18-22, 2016, pages 33–46, 2016.
Google Scholar
Aloïs Brunel, Damiano Mazza, and Michele Pagani. Backpropagation in the simply typed lambda-calculus with linear negation. Proc. ACM Program. Lang., 4(POPL):64:1–64:27, 2020.
Google Scholar
Simon Castellan and Hugo Paquet. Probabilistic programming inference via intensional semantics. In European Symposium on Programming, pages 322–349. Springer, 2019.
Google Scholar
Arun Chaganty, Aditya Nori, and Sriram Rajamani. Efficiently sampling probabilistic programs via program analysis. In Artificial Intelligence and Statistics, pages 153–160, 2013.
Google Scholar
Lori A. Clarke. A system to generate test data and symbolically execute programs. IEEE Trans. Software Eng., 2(3):215–222, 1976.
Google Scholar
Ryan Culpepper and Andrew Cobb. Contextual equivalence for probabilistic programs with continuous random variables and scoring. In Hongseok Yang, editor, Programming Languages and Systems - 26th European Symposium on Programming, ESOP 2017, Held as Part of the European Joint Conferences on Theory and Practice of Software, ETAPS 2017, Uppsala, Sweden, April 22-29, 2017, Proceedings, volume 10201 of Lecture Notes in Computer Science, pages 368–392. Springer, 2017.
Google Scholar
Marco F. Cusumano-Towner, Feras A. Saad, Alexander K. Lew, and Vikash K. Mansinghka. Gen: a general-purpose probabilistic programming system with programmable inference. In Kathryn S. McKinley and Kathleen Fisher, editors, Proceedings of the 40th ACM SIGPLAN Conference on Programming Language Design and Implementation, PLDI 2019, Phoenix, AZ, USA, June 22-26, 2019, pages 221–236. ACM, 2019.
Google Scholar
S. Duane, A. D. Kennedy, B. J. Pendleton, and D. Roweth. Hybrid monte carlo. Physics letters B, 1987.
Google Scholar
Thomas Ehrhard, Michele Pagani, and Christine Tasson. Measurable cones and stable, measurable functions: a model for probabilistic higher-order programming. PACMPL, 2(POPL):59:1–59:28, 2018.
Google Scholar
Thomas Ehrhard and Laurent Regnier. The differential lambda-calculus. Theor. Comput. Sci., 309(1-3):1–41, 2003.
Google Scholar
Matthew D. Hoffman, David M. Blei, Chong Wang, and John W. Paisley. Stochastic variational inference. J. Mach. Learn. Res., 14(1):1303–1347, 2013.
Google Scholar
Mathieu Huot, Sam Staton, and Matthijs Vákár. Correctness of automatic differentiation via diffeologies and categorical gluing. In Jean Goubault-Larrecq and Barbara König, editors, Foundations of Software Science and Computation Structures - 23rd International Conference, FOSSACS 2020, Held as Part of the European Joint Conferences on Theory and Practice of Software, ETAPS 2020, Dublin, Ireland, April 25-30, 2020, Proceedings, volume 12077 of Lecture Notesin Computer Science, pages 319–338. Springer, 2020.
Google Scholar
Chung-Kil Hur, Aditya V Nori, Sriram K Rajamani, and Selva Samuel. A provably correct sampler for probabilistic programs. In 35th IARCS Annual Conference on Foundations of Software Technology and Theoretical Computer Science (FSTTCS 2015). Schloss Dagstuhl-Leibniz-Zentrum fuer Informatik, 2015.
Google Scholar
Wazim Mohammed Ismail and Chung-chieh Shan. Deriving a probability density calculator (functional pearl). In Jacques Garrigue, Gabriele Keller, and Eijiro Sumii, editors, Proceedings of the 21st ACM SIGPLAN International Conference on Functional Programming, ICFP 2016, Nara, Japan, September 18-22, 2016, pages 47–59. ACM, 2016.
Google Scholar
Benjamin Lucien Kaminski, Joost-Pieter Katoen, and Christoph Matheja. On the hardness of analyzing probabilistic programs. Acta Inf., 56(3):255–285, 2019.
Google Scholar
James C. King. Symbolic execution and program testing. Commun. ACM, 19(7):385–394, 1976.
Google Scholar
Diederik P. Kingma and Max Welling. Auto-encoding variational bayes. In Yoshua Bengio and Yann LeCun, editors, 2nd International Conference on Learning Representations, ICLR 2014, Banff, AB, Canada, April 14-16, 2014, Conference Track Proceedings, 2014.
Google Scholar
Oleg Kiselyov. Problems of the Lightweight Implementation of Probabilistic Programming. In PPS Workshop, 2016.
Google Scholar
Dexter Kozen. Semantics of probabilistic programs. In 20th Annual Symposium on Foundations of Computer Science, San Juan, Puerto Rico, 29-31 October 1979, pages 101–114, 1979.
Google Scholar
Alp Kucukelbir, Rajesh Ranganath, Andrew Gelman, and David M. Blei. Automatic variational inference in stan. In Advances in Neural Information Processing Systems 28: Annual Conference on Neural Information Processing Systems 2015, December 7-12, 2015, Montreal, Quebec, Canada, pages 568–576, 2015.
Google Scholar
Jeffrey M. Lee. Manifolds and Differential Geometry, volume 107 of Graduate Studies in Mathematics. AMS, 2009.
Google Scholar
John M. Lee. An introduction to smooth manifolds, volume 218 of Graduate Texts in Mathematics. Springer, second edition, 2013.
Google Scholar
Wonyeol Lee, Hangyeol Yu, Xavier Rival, and Hongseok Yang. On correctness of automatic differentiation for non-differentiable functions. CoRR, abs/2006.06903, 2020.
Google Scholar
Wonyeol Lee, Hangyeol Yu, Xavier Rival, and Hongseok Yang. Towards verified stochastic variational inference for probabilistic programs. PACMPL, 4(POPL):16:1–16:33, 2020.
Google Scholar
Wonyeol Lee, Hangyeol Yu, and Hongseok Yang. Reparameterization gradient for non-differentiable models. In Samy Bengio, Hanna M. Wallach, Hugo Larochelle, Kristen Grauman, Nicolò Cesa-Bianchi, and Roman Garnett, editors, Advances in Neural Information Processing Systems 31: Annual Conference on Neural Information Processing Systems 2018, NeurIPS 2018, 3-8 December 2018, Montréal, Canada, pages 5558–5568, 2018.
Google Scholar
Alexander K Lew, Marco F Cusumano-Towner, Benjamin Sherman, Michael Carbin, and Vikash K Mansinghka. Trace types and denotational semantics for sound programmable inference in probabilistic languages. Proceedings of the ACM on Programming Languages, 4(POPL):1–32, 2019.
Google Scholar
Carol Mak, C.-H. Luke Ong, Hugo Paquet, and Dominik Wagner. Densities of almost-surely terminating probabilistic programs are differentiable almost everywhere. CoRR, abs/2004.03924, 2020.
Google Scholar
Damiano Mazza and Michele Pagani. Automatic differentiation in pcf. Proc. ACM Program. Lang., 5(POPL), January 2021.
Google Scholar
Praveen Narayanan and Chung-chieh Shan. Symbolic disintegration with a variety of base measures. ACM Transactions on Programming Languages and Systems (TOPLAS), 42(2):1–60, 2020.
Google Scholar
Radford M Neal. Mcmc using hamiltonian dynamics. Handbook of Markov Chain Monte Carlo, page 113, 2011.
Google Scholar
Akihiko Nishimura, David B Dunson, and Jianfeng Lu. Discontinuous hamiltonian monte carlo for discrete parameters and discontinuous likelihoods. Biometrika, 107(2):365–380, Mar 2020.
Google Scholar
Akihiko Nishimura, David B Dunson, and Jianfeng Lu. Discontinuous Hamiltonian Monte Carlo for discrete parameters and discontinuous likelihoods. Biometrika, 03 2020. asz083.
Google Scholar
Rajesh Ranganath, Sean Gerrish, and David M. Blei. Black box variational inference. In Proceedings of the Seventeenth International Conference on Artificial Intelligence and Statistics, AISTATS 2014, Reykjavik, Iceland, April 22-25, 2014, pages 814–822, 2014.
Google Scholar
Rajesh Ranganath, Sean Gerrish, and David M. Blei. Black box variational inference. In Proceedings of the Seventeenth International Conference on Artificial Intelligence and Statistics, AISTATS 2014, Reykjavik, Iceland, April 22-25, 2014, volume 33 of JMLR Workshop and Conference Proceedings, pages 814–822. JMLR.org, 2014.
Google Scholar
Danilo Jimenez Rezende, Shakir Mohamed, and Daan Wierstra. Stochastic backpropagation and approximate inference in deep generative models. In Proceedings of the 31th International Conference on Machine Learning, ICML 2014, Beijing, China, 21-26 June 2014, volume 32 of JMLR Workshop and Conference Proceedings, pages 1278–1286. JMLR.org, 2014.
Google Scholar
Walter Rudin. Principles of Mathematical Analysis. International Series in Pure and Applied Mathematics. McGraw-Hill Education, 3rd edition edition, 1976.
Google Scholar
Nasser Saheb-Djahromi. Probabilistic lcf. In International Symposium on Mathematical Foundations of Computer Science, pages 442–451. Springer, 1978.
Google Scholar
Adam Ścibior, Ohad Kammar, Matthijs Vákár, Sam Staton, Hongseok Yang, Yufei Cai, Klaus Ostermann, Sean K Moss, Chris Heunen, and Zoubin Ghahramani. Denotational validation of higher-order bayesian inference. Proceedings of the ACM on Programming Languages, 2(POPL):60, 2017.
Google Scholar
Dana S. Scott. A type-theoretical alternative to ISWIM, CUCH, OWHY. Theor. Comput. Sci., 121(1&2):411–440, 1993.
Google Scholar
Kurt Sieber. Relating full abstraction results for different programming languages. In Foundations of Software Technology and Theoretical Computer Science, Tenth Conference, Bangalore, India, December 17-19, 1990, Proceedings, pages 373–387, 1990.
Google Scholar
Sam Staton. Commutative semantics for probabilistic programming. In Hongseok Yang, editor, Programming Languages and Systems - 26th European Symposium on Programming, ESOP 2017, Held as Part of the European Joint Conferences on Theory and Practice of Software, ETAPS 2017,Uppsala, Sweden, April 22-29, 2017, Proceedings, volume 10201 of Lecture Notes in Computer Science, pages 855–879. Springer, 2017.
Google Scholar
Michalis K. Titsias and Miguel Lázaro-Gredilla. Doubly stochastic variational bayes for non-conjugate inference. In Proceedings of the 31th International Conference on Machine Learning, ICML 2014, Beijing, China, 21-26 June 2014, volume 32 of JMLR Workshop and Conference Proceedings, pages 1971–1979. JMLR.org, 2014.
Google Scholar
Loring W. Tu. An introduction to manifolds. Universitext. Springer-Verlag, 2011.
Google Scholar
Matthijs Vákár, Ohad Kammar, and Sam Staton. A domain theory for statistical probabilistic programming. PACMPL, 3(POPL):36:1–36:29, 2019.
Google Scholar
Mitchell Wand, Ryan Culpepper, Theophilos Giannakopoulos, and Andrew Cobb. Contextual equivalence for a probabilistic language with continuous random variables and recursion. PACMPL, 2(ICFP):87:1–87:30, 2018.
Google Scholar
David Wingate, Andreas Stuhlmüller, and Noah D. Goodman. Lightweight implementations of probabilistic programming languages via transformational compilation. In Geoffrey J. Gordon, David B. Dunson, and Miroslav Dudík, editors, Proceedings of the Fourteenth International Conference on Artificial Intelligence and Statistics, AISTATS 2011, Fort Lauderdale, USA, April 11-13, 2011, volume 15 of JMLR Proceedings, pages 770–778. JMLR.org, 2011.
Google Scholar
Hongseok Yang. Some semantic issues in probabilistic programming languages (invited talk). In Herman Geuvers, editor, 4th International Conference on Formal Structures for Computation and Deduction, FSCD 2019, June 24-30, 2019, Dortmund, Germany, volume 131 of LIPIcs, pages 4:1–4:6. Schloss Dagstuhl - Leibniz-Zentrum für Informatik, 2019.
Google Scholar
Yuan Zhou, Bradley J. Gram-Hansen, Tobias Kohn, Tom Rainforth, Hongseok Yang, and Frank Wood. LF-PPL: A low-level first order probabilistic programming language for non-differentiable models. In Kamalika Chaudhuri and Masashi Sugiyama, editors, The 22nd International Conference on Artificial Intelligence and Statistics, AISTATS 2019, 16-18 April 2019, Naha, Okinawa, Japan, volume 89 of Proceedings of Machine Learning Research, pages 148–157. PMLR, 2019.
Google Scholar
Yuan Zhou, Hongseok Yang, Yee Whye Teh, and Tom Rainforth. Divide, conquer, and combine: a new inference strategy for probabilistic programs with stochastic support. CoRR, abs/1910.13324, 2019.
Google Scholar

Download references

Author information

Authors and Affiliations

Department of Computer Science, University of Oxford, Oxford, UK
Carol Mak, C.-H. Luke Ong, Hugo Paquet & Dominik Wagner

Authors

Carol Mak
View author publications
You can also search for this author in PubMed Google Scholar
C.-H. Luke Ong
View author publications
You can also search for this author in PubMed Google Scholar
Hugo Paquet
View author publications
You can also search for this author in PubMed Google Scholar
Dominik Wagner
View author publications
You can also search for this author in PubMed Google Scholar

Corresponding author

Correspondence to Dominik Wagner .

Editor information

Editors and Affiliations

Imperial College, London, UK
Nobuko Yoshida

Rights and permissions

Open Access This chapter is licensed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license and indicate if changes were made.

The images or other third party material in this chapter are included in the chapter's Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the chapter's Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.

Reprints and permissions

Copyright information

About this paper

Cite this paper

Mak, C., Ong, CH.L., Paquet, H., Wagner, D. (2021). Densities of Almost Surely Terminating Probabilistic Programs are Differentiable Almost Everywhere. In: Yoshida, N. (eds) Programming Languages and Systems. ESOP 2021. Lecture Notes in Computer Science(), vol 12648. Springer, Cham. https://doi.org/10.1007/978-3-030-72019-3_16

Download citation

DOI: https://doi.org/10.1007/978-3-030-72019-3_16
Published: 23 March 2021
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-72018-6
Online ISBN: 978-3-030-72019-3
eBook Packages: Computer ScienceComputer Science (R0)

Publish with us

Policies and ethics

Societies and partnerships

The European Joint Conferences on Theory and Practice of Software. (opens in a new tab)

Densities of Almost Surely Terminating Probabilistic Programs are Differentiable Almost Everywhere

Abstract

Chapter PDF

Similar content being viewed by others

Does a Program Yield the Right Distribution?

Probabilistic Analysis of Programs: A Weak Limit Approach

Weakest Precondition Reasoning for Expected Run–Times of Probabilistic Programs

References

Author information

Authors and Affiliations

Corresponding author

Editor information

Editors and Affiliations

Rights and permissions

Copyright information

About this paper

Cite this paper

Download citation

Publish with us

Societies and partnerships

Navigation

Densities of Almost Surely Terminating Probabilistic Programs are Differentiable Almost Everywhere

Abstract

Chapter PDF

Similar content being viewed by others

Does a Program Yield the Right Distribution?

Probabilistic Analysis of Programs: A Weak Limit Approach

Weakest Precondition Reasoning for Expected Run–Times of Probabilistic Programs

References

Author information

Authors and Affiliations

Corresponding author

Editor information

Editors and Affiliations

Rights and permissions

Copyright information

About this paper

Cite this paper

Download citation

Share this paper

Publish with us

Societies and partnerships

Search

Navigation