Commutative Semantics for Probabilistic Programming

  • Sam StatonEmail author
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 10201)


We show that a measure-based denotational semantics for probabilistic programming is commutative.

The idea underlying probabilistic programming languages (Anglican, Church, Hakaru, etc.) is that programs express statistical models as a combination of prior distributions and likelihood of observations. The product of prior and likelihood is an unnormalized posterior distribution, and the inference problem is to find the normalizing constant. One common semantic perspective is thus that a probabilistic program is understood as an unnormalized posterior measure, in the sense of measure theory, and the normalizing constant is the measure of the entire semantic domain.

A programming language is said to be commutative if only data flow is meaningful; control flow is irrelevant, and expressions can be re-ordered. It has been unclear whether probabilistic programs are commutative because it is well-known that Fubini-Tonelli theorems for reordering integration fail in general. We show that probabilistic programs are in fact commutative, by characterizing the measures/kernels that arise from programs as ‘s-finite’, i.e. sums of finite measures/kernels.

The result is of theoretical interest, but also of practical interest, because program transformations based on commutativity help with symbolic inference and can improve the efficiency of simulation.


Probability Measure Lebesgue Measure Measurable Space Importance Sampling Sampling Semantic 
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.



I am profoundly grateful to my coauthors on [43] for many discussions about the problems and examples in this subject (C. Heunen, O. Kammar, F. Wood, H. Yang). The MFPS 2016 special session on Probabilistic Programming was also illuminating and it was helpful to discuss these issues with the participants (J Börgstrom, D Roy, CC Shan and others). Thanks too to Adam Ścibior and the ESOP reviewers.

Research supported by a Royal Society University Research Fellowship.


  1. 1.
    Ackerman, N.L., Freer, C.E., Roy, D.M.: Noncomputable conditional distributions. In: Proceedings of the LICS 2011 (2011)Google Scholar
  2. 2.
    Atkey, R.: What is a categorical model of arrows? In: Proceedings of the MSFP 2008 (2008)Google Scholar
  3. 3.
    Borgström, J., Gordon, A.D., Greenberg, M., Margetson, J., van Gael, J.: Measure transformer semantics for Bayesian machine learning. LMCS 9(3), 11 (2013)Google Scholar
  4. 4.
    Borgström, J., Lago, U.D., Gordon, A.D., Szymczak, M.: A lambda-calculus foundation for universal probabilistic programming. In: Proceedings of the ICFP (2016)Google Scholar
  5. 5.
    Carette, J., Shan, C.-C.: Simplifying probabilistic programs using computer algebra. In: Gavanelli, M., Reppy, J. (eds.) PADL 2016. LNCS, vol. 9585, pp. 135–152. Springer, Cham (2016). doi: 10.1007/978-3-319-28228-2_9 CrossRefGoogle Scholar
  6. 6.
    Culpepper, R., Cobb, A.: Contextual equivalence for probabilistic programs with continuous random variables and scoring. In: Proceedings of the ESOP 2017 (2017, to appear)Google Scholar
  7. 7.
    Doberkat, E.E.: Stochastic Relations: Foundations for Markov Transition Systems. Chapman & Hall, London (2007)CrossRefzbMATHGoogle Scholar
  8. 8.
    Faissole, F., Spitters, B.: Synthetic topology in homotopy type theory for probabilistic programming. In: Proceedings of the PPS 2017 (2017)Google Scholar
  9. 9.
    Getoor, R.K.: Excessive Measures. Birkhäuser (1990)Google Scholar
  10. 10.
    Giry, M.: A categorical approach to probability theory. Categorical Aspects Topology Anal. 915, 68–85 (1982)MathSciNetCrossRefzbMATHGoogle Scholar
  11. 11.
    Goodman, N., Mansinghka, V., Roy, D.M., Bonawitz, K., Tenenbaum, J.B.: Church: a language for generative models. In: UAI (2008)Google Scholar
  12. 12.
    Haghverdi, E.: A categorical approach to linear logic, geometry of proofs and full completeness. Ph.D. thesis, Ottawa (2000)Google Scholar
  13. 13.
    Hermida, C.: Representable multicategories. Adv. Math. 151, 164–225 (2000)MathSciNetCrossRefzbMATHGoogle Scholar
  14. 14.
    Heunen, C., Kammar, O., Staton, S., Yang, H.: A convenient category for higher-order probability theory (2017). arXiv:1701.02547
  15. 15.
    Huang, D., Morrisett, G.: An application of computable distributions to the semantics of probabilistic programs: part 2. In: Proceedings of the PPS 2017 (2017)Google Scholar
  16. 16.
    Hughes, J.: Generalising monads to arrows. Sci. Comput. Program. 37(1–3), 67–111 (2000)MathSciNetCrossRefzbMATHGoogle Scholar
  17. 17.
    Jacobs, B., Heunen, C., Hasuo, I.: Categorical semantics for arrows. J. Funct. Program. 19(3–4), 403–438 (2009)MathSciNetCrossRefzbMATHGoogle Scholar
  18. 18.
    Jacobs, B., Zanasi, F.: A predicate/state transformer semantics for Bayesian learning. In: Proceedings of the MFPS 2016 (2016)Google Scholar
  19. 19.
    Jeffrey, A.: Premonoidal categories and a graphical view of programs. Unpublished (1997)Google Scholar
  20. 20.
    Kallenberg, O.: Stationary and invariant densities and disintegration kernels. Probab. Theory Relat. Fields 160, 567–592 (2014)MathSciNetCrossRefzbMATHGoogle Scholar
  21. 21.
    Kozen, D.: Semantics of probablistic programs. J. Comput. Syst. Sci. 22, 328–350 (1981)CrossRefzbMATHGoogle Scholar
  22. 22.
    Lambek, J.: Deductive systems and categories II. In: Hilton, P.J. (ed.) Category Theory, Homology Theory and Their Applications. LNM, vol. 86, pp. 76–122. Springer, Heidelberg (1969)CrossRefGoogle Scholar
  23. 23.
    Last, G., Penrose, M.: Lectures on the Poisson process. CUP (2016)Google Scholar
  24. 24.
    Leinster, T.: Higher operads, higher categories. CUP (2004)Google Scholar
  25. 25.
    Levy, P.B., Power, J., Thielecke, H.: Modelling environments in call-by-value programming languages. Inf. Comput. 185(2), 182–210 (2003)MathSciNetCrossRefzbMATHGoogle Scholar
  26. 26.
    Mansinghka, V.K., Selsam, D., Perov, Y.N.: Venture: a higher-order probabilistic programming platform with programmable inference (2014).
  27. 27.
    Mardare, R., Panangaden, P., Plotkin, G.: Quantitative algebraic reasoning. In: Proceedings of the LICS 2016 (2016)Google Scholar
  28. 28.
    Møgelberg, R.E., Staton, S.: Linear usage of state. Logical Methods Comput. Sci. 10 (2014)Google Scholar
  29. 29.
    Moggi, E.: Notions of computation and monads. Inf. Comput. 93(1), 55–92 (1991)MathSciNetCrossRefzbMATHGoogle Scholar
  30. 30.
    Narayanan, P., Carette, J., Romano, W., Shan, C., Zinkov, R.: Probabilistic inference by program transformation in Hakaru (system description). In: Kiselyov, O., King, A. (eds.) FLOPS 2016. LNCS, vol. 9613, pp. 62–79. Springer, Cham (2016). doi: 10.1007/978-3-319-29604-3_5 CrossRefGoogle Scholar
  31. 31.
    Paige, B., Wood, F.: A compilation target for probabilistic programming languages. In: ICML (2014)Google Scholar
  32. 32.
    Pollard, D.: A user’s guide to measure theoretic probability. CUP (2002)Google Scholar
  33. 33.
    Power, J.: Generic models for computational effects. TCS 364(2), 254–269 (2006)MathSciNetCrossRefzbMATHGoogle Scholar
  34. 34.
    Ramsey, N., Pfeffer, A.: Stochastic lambda calculus and monads of probability distributions. In: POPL (2002)Google Scholar
  35. 35.
    Ramsey, N.: All you need is the monad.. what monad was that again? In: PPS Workshop (2016)Google Scholar
  36. 36.
    Scherrer, C.: An exponential family basis for probabilistic programming. In: Proceedings of the PPS 2017 (2017)Google Scholar
  37. 37.
    Ścibor, A., Ghahramani, Z., Gordon, A.D.: Practical probabilistic programming with monads. In: Proceedings of the Haskell Symposium. ACM (2015)Google Scholar
  38. 38.
    Shan, C.C., Ramsey, N.: Symbolic Bayesian inference by symbolic disintegration (2016)Google Scholar
  39. 39.
    Sharpe, M.: General Theory of Markov Processes. Academic Press, Cambridge (1988)zbMATHGoogle Scholar
  40. 40.
    Stan Development Team: Stan: A C++ library for probability and sampling, version 2.5.0 (2014).
  41. 41.
    Staton, S.: Freyd categories are enriched Lawvere theories. In: Algebra, Coalgebra and Topology, ENTCS, vol. 303 (2013)Google Scholar
  42. 42.
    Staton, S., Levy, P.: Universal properties for impure programming languages. In: Proceedings of the POPL 2013 (2013)Google Scholar
  43. 43.
    Staton, S., Yang, H., Heunen, C., Kammar, O., Wood, F.: Semantics for probabilistic programming: higher-order functions, continuous distributions, and soft constraints. In: Proceedings of the LICS 2016 (2016)Google Scholar
  44. 44.
    Toronto, N., McCarthy, J., Van Horn, D.: Running probabilistic programs backwards. In: Vitek, J. (ed.) ESOP 2015. LNCS, vol. 9032, pp. 53–79. Springer, Heidelberg (2015). doi: 10.1007/978-3-662-46669-8_3 CrossRefGoogle Scholar
  45. 45.
    Vickers, S.: A monad of valuation locales available from the author’s website (2011)Google Scholar
  46. 46.
    Wood, F., van de Meent, J.W., Mansinghka, V.: A new approach to probabilistic programming inference. In: AISTATS (2014)Google Scholar

Copyright information

© Springer-Verlag GmbH Germany 2017

Authors and Affiliations

  1. 1.University of OxfordOxfordUK

Personalised recommendations