Relational Reasoning for Markov Chains in a Probabilistic Guarded Lambda Calculus
Abstract
We extend the simply-typed guarded \(\lambda \)-calculus with discrete probabilities and endow it with a program logic for reasoning about relational properties of guarded probabilistic computations. This provides a framework for programming and reasoning about infinite stochastic processes like Markov chains. We show that the logic is sound by interpreting its judgements in the topos of trees and by using probabilistic couplings for the semantics of relational assertions over distributions on discrete types.
The program logic is designed to support syntax-directed proofs in the style of relational refinement types, but retains the expressiveness of higher-order logic extended with discrete distributions, and the ability to reason relationally about expressions that have different types or syntactic structure. In addition, our proof system leverages a well-known theorem from the coupling literature to justify more expressive proof rules for relational reasoning about probabilistic expressions. We illustrate these benefits with a broad range of examples that were beyond the scope of previous systems, including shift couplings and lump couplings between random walks.
1 Introduction
Modelling Probabilistic Infinite Objects. A first challenge is to model probabilistic infinite objects. We focus on the case of Markov chains, due to their importance. A (discrete-time) Markov chain is a sequence of random variables \(\{X_i\}\) over some fixed type T satisfying some independence property. Thus, the straightforward way of modelling a Markov chain is as a stream of distributions over T. Going back to the simple example outlined above, it is natural to think about this kind of discrete-time Markov chain as characterized by the sequence of positions \(\{p_i\}_{i \in \mathbb {N}}\), which in turn can be described as an infinite set indexed by the natural numbers. This suggests that a natural way to model such a Markov chain is to use streams in which each element is produced probabilistically from the previous one. However, there are some downsides to this representation. First of all, it requires explicit reasoning about probabilistic dependency, since \(X_{i+1}\) depends on \(X_i\). Also, we might be interested in global properties of the executions of the Markov chain, such as “The probability of passing through the initial state infinitely many times is 1”. These properties are naturally expressed as properties of the whole stream. For these reasons, we want to represent Markov chains as distributions over streams. Seemingly, one downside of this representation is that the set of streams is not countable, which suggests the need for introducing heavy measure-theoretic machinery in the semantics of the programming language, even when the underlying type is discrete or finite.
Fortunately, measure-theoretic machinery can be avoided (for discrete distributions) by developing a probabilistic extension of the simply-typed guarded \(\lambda \)-calculus and giving a semantic interpretation in the topos of trees [1]. Informally, the simply-typed guarded \(\lambda \)-calculus [1] extends the simply-typed lambda calculus with a later modality, denoted by \(\triangleright \). The type \(\triangleright A\) ascribes expressions that are available one unit of logical time in the future. The \(\triangleright \) modality allows one to model infinite types by using “finite” approximations. For example, a stream of natural numbers is represented by the sequence of its (increasing) prefixes in the topos of trees. The prefix containing the first i elements has the type \(S_i \triangleq \mathbb {N} \times \triangleright \mathbb {N} \times \ldots \times \triangleright ^{(i-1)} \mathbb {N}\), representing that the first element is available now, the second element a unit time in the future, and so on. This is the key to representing probability distributions over infinite objects without measure-theoretic semantics: we model probability distributions over non-discrete sets as discrete distributions over the approximations of those sets. For example, a distribution over streams of natural numbers (which a priori would be non-discrete since the set of streams is uncountable) would be modelled by a sequence of distributions over the finite approximations \(S_1, S_2, \ldots \) of streams. Importantly, since each \(S_i\) is countable, each of these distributions can be discrete.
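The idea of approximating a distribution over streams by discrete distributions over prefixes can be sketched in Python. This is only an illustration under stated assumptions: the chain is taken to be a simple random walk on the integers, and the names `step` and `prefix_dist` are hypothetical, not part of the calculus.

```python
from fractions import Fraction

def step(p):
    """Assumed transition: move to p-1 or p+1, each with probability 1/2."""
    return {p - 1: Fraction(1, 2), p + 1: Fraction(1, 2)}

def prefix_dist(start, i):
    """Discrete distribution over the length-i prefixes of the chain from `start`."""
    dist = {(start,): Fraction(1)}
    for _ in range(i - 1):
        extended = {}
        for prefix, pr in dist.items():
            # Extend every prefix by one probabilistic transition.
            for q, w in step(prefix[-1]).items():
                key = prefix + (q,)
                extended[key] = extended.get(key, Fraction(0)) + pr * w
        dist = extended
    return dist

s3 = prefix_dist(0, 3)  # four prefixes of length 3, each with probability 1/4
```

Each `prefix_dist(start, i)` has finite support, so no measure theory is needed at any finite stage, even though the set of all infinite streams is uncountable.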
In more detail, our relational logic comes with typing rules that allow one to reason about relational properties by exploiting as much as possible the syntactic similarities between \(t_1\) and \(t_2\), and to fall back on pure logical reasoning when these are not available. In order to apply relational reasoning to guarded computations the logic provides relational rules for the later modality \(\triangleright \) and for a related modality \(\square {}\), called “constant”. These rules allow the verification of general relational properties that go beyond the traditional notion of program equivalence and, moreover, they allow the verification of properties of guarded computations over different types. The ability to reason about computations of different types provides significant benefits over alternative formalisms for relational reasoning. For example, it enables reasoning about relations between programs working on different data structures, e.g. a relation between a program working on a stream of natural numbers and a program working on a stream of pairs of natural numbers, or between programs having different structures, e.g. a relation between an application and a case expression.
Importantly, our approach for reasoning formally about probabilistic computations is based on probabilistic couplings, a standard tool from the analysis of Markov chains [3, 4]. From a verification perspective, probabilistic couplings go beyond equivalence properties of probabilistic programs, which have been studied extensively in the verification literature, and yet support compositional reasoning [5, 6]. The main attractive feature of coupling-based reasoning is that it limits the need to reason explicitly about probabilities, which avoids complex verification conditions. We provide sound proof rules for reasoning about probabilistic couplings. Our rules make several improvements over prior relational verification logics based on couplings. First, we support reasoning over probabilistic processes of different types. Second, we use Strassen’s theorem [7], a remarkable result about probabilistic couplings, to achieve greater expressivity. Previous systems required proving a bijection between the sampling spaces to show the existence of a coupling [5, 6]; Strassen’s theorem gives a way to show their existence which is applicable in settings where the bijection-based approach cannot be applied. Third, we support reasoning with what are called shift couplings, which relate the states of two Markov chains at possibly different times (more explanations below).
Case Studies. We show the flexibility of our formalism by verifying several examples of relational properties of probabilistic computations, and Markov chains in particular. These examples cannot be verified with existing approaches.
First, we verify a classic example of probabilistic noninterference which requires reasoning about computations at different types. Second, in the context of Markov chains, we verify an example about stochastic dominance which exercises our more general rule for proving the existence of couplings modelled by expressions of different types. Finally, we verify an example involving shift relations in an infinite computation. This style of reasoning is motivated by “shift” couplings in Markov chains. In contrast to a standard coupling, which relates the states of two Markov chains at the same time t, a shift coupling relates the states of two Markov chains at possibly different times. Our specific example relates a standard random walk (described earlier) to a variant called a lazy random walk; the verification requires relating the state of the standard random walk at time t to the state of the lazy random walk at time 2t. We note that this kind of reasoning is impossible with conventional relational proof rules even in a non-probabilistic setting. Therefore, we provide a novel family of proof rules for reasoning about shift relations. At a high level, the rules combine a careful treatment of the later and constant modalities with a refined treatment of fixpoint operators, allowing us to relate different iterates of function bodies.
1.1 Summary of Contributions
 1. A probabilistic extension of the guarded \(\lambda \)-calculus that enables the definition of Markov chains as discrete probability distributions over streams.
 2. A relational logic based on couplings to reason in a syntax-directed manner about (relational) properties of Markov chains. This logic supports reasoning about programs that have different types and structures. Additionally, this logic uses results from the coupling literature to achieve greater expressivity than previous systems.
 3. An extension of the relational logic that allows one to relate the states of two streams at possibly different times. This extension supports reasoning principles, such as shift couplings, that escape conventional relational logics.
Omitted technical details can be found in the full version of the paper with appendix at https://arxiv.org/abs/1802.09787.
2 Mathematical Preliminaries
This section reviews the definition of discrete probability sub-distributions and introduces mathematical couplings.
Definition 1 (Discrete probability distribution)
Let C be a discrete (i.e., finite or countable) set. A (total) distribution over C is a function \(\mu : C \rightarrow [0,1]\) such that \( \sum _{x\in C} \mu (x) = 1\). The support of a distribution \(\mu \) is the set of points with nonzero probability, \( \mathsf {supp}\ \mu \triangleq \{x \in C \mid \mu (x) > 0 \}\). We denote the set of distributions over C as \(\mathsf {D}(C)\). Given a subset \(E \subseteq C\), the probability of sampling from \(\mu \) a point in E is denoted \(\Pr _{x\leftarrow \mu }[x \in E]\), and is equal to \(\sum _{x \in E} \mu (x)\).
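As a small illustration of Definition 1, finite distributions can be represented as Python dictionaries from points to exact rational probabilities. This is a sketch that only covers finite supports; the helper names `is_distribution`, `supp` and `prob` are hypothetical.

```python
from fractions import Fraction

def is_distribution(mu):
    """Check that mu maps points to [0, 1] and that the total mass is 1."""
    return all(0 <= p <= 1 for p in mu.values()) and sum(mu.values()) == 1

def supp(mu):
    """Support: the set of points with nonzero probability."""
    return {x for x, p in mu.items() if p > 0}

def prob(mu, event):
    """Probability of sampling from mu a point satisfying the predicate `event`."""
    return sum((p for x, p in mu.items() if event(x)), Fraction(0))

# A fair six-sided die as a distribution over {1, ..., 6}.
die = {k: Fraction(1, 6) for k in range(1, 7)}
even = prob(die, lambda x: x % 2 == 0)
```

Using `Fraction` rather than floating point keeps the check that the masses sum to 1 exact.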
Definition 2 (Marginals)
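For reference, the standard notion of marginal can be written as follows. This is a sketch of the usual definition, assuming the notation \(\mathsf {D}(\pi _1)\) and \(\mathsf {D}(\pi _2)\) used for marginals in Definition 3, for a distribution \(\mu \in \mathsf {D}(C_1\times C_2)\):

```latex
\[
  \mathsf{D}(\pi_1)(\mu)(x_1) = \sum_{x_2 \in C_2} \mu(x_1, x_2),
  \qquad
  \mathsf{D}(\pi_2)(\mu)(x_2) = \sum_{x_1 \in C_1} \mu(x_1, x_2).
\]
```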
Probabilistic Couplings. Probabilistic couplings are a fundamental tool in the analysis of Markov chains. When analyzing a relation between two probability distributions it is sometimes useful to consider instead a distribution over the product space that somehow “couples” the randomness in a convenient manner.
We now review the definition of couplings and state relevant properties.
Definition 3 (Couplings)
Let \(\mu _1\in \mathsf {D}(C_1)\) and \(\mu _2\in \mathsf {D}(C_2)\), and \(R\subseteq C_1\times C_2\).

A distribution \(\mu \in \mathsf {D}(C_1\times C_2)\) is a coupling for \(\mu _1\) and \(\mu _2\) iff its first and second marginals coincide with \(\mu _1\) and \(\mu _2\) respectively, i.e. \(\mathsf {D}(\pi _1)(\mu )=\mu _1\) and \(\mathsf {D}(\pi _2)(\mu )=\mu _2\).

A distribution \(\mu \in \mathsf {D}(C_1\times C_2)\) is an \(R\)-coupling for \(\mu _1\) and \(\mu _2\) if it is a coupling for \(\mu _1\) and \(\mu _2\) and, moreover, \(\Pr _{(x_1,x_2)\leftarrow \mu } [R~x_1~x_2]=1\), i.e., if the support of the distribution \(\mu \) is included in R.
Moreover, we write \(\diamond _{\mu _1, \mu _2}. R\) iff there exists an \(R\)-coupling for \(\mu _1\) and \(\mu _2\).
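Definition 3 can be checked mechanically on finite examples. The following sketch assumes distributions are dictionaries with exact rational weights; `marginal`, `is_coupling` and `is_r_coupling` are hypothetical helper names introduced for illustration.

```python
from fractions import Fraction

def marginal(mu, index):
    """Marginal of a joint distribution on pairs (index 0 or 1)."""
    out = {}
    for pair, p in mu.items():
        out[pair[index]] = out.get(pair[index], Fraction(0)) + p
    return out

def nonzero(mu):
    """Drop zero-probability points so that dictionaries compare by content."""
    return {x: p for x, p in mu.items() if p != 0}

def is_coupling(mu, mu1, mu2):
    """mu is a coupling iff its marginals coincide with mu1 and mu2."""
    return (nonzero(marginal(mu, 0)) == nonzero(mu1)
            and nonzero(marginal(mu, 1)) == nonzero(mu2))

def is_r_coupling(mu, mu1, mu2, rel):
    """An R-coupling is a coupling whose support is included in R."""
    in_rel = all(rel(x1, x2) for (x1, x2), p in mu.items() if p > 0)
    return is_coupling(mu, mu1, mu2) and in_rel

fair = {0: Fraction(1, 2), 1: Fraction(1, 2)}
identity = {(0, 0): Fraction(1, 2), (1, 1): Fraction(1, 2)}
independent = {(a, b): Fraction(1, 4) for a in (0, 1) for b in (0, 1)}
```

Both `identity` and `independent` are couplings of two fair coins, but only `identity` is an \((=)\)-coupling, which illustrates that the choice of coupling is what carries the relational information.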
Theorem 1 (Strassen’s theorem)
Consider \(\mu _1\in \mathsf {D}(C_1)\) and \(\mu _2\in \mathsf {D}(C_2)\), and \(R\subseteq C_1 \times C_2\). Then \(\diamond _{\mu _1, \mu _2}. R\) iff for every \(X \subseteq C_1\), \(\Pr _{x_1\leftarrow \mu _1}[x_1\in X] \le \Pr _{x_2\leftarrow \mu _2}[x_2\in R(X)]\), where R(X) is the image of X under R, i.e. \(R(X) =\{ y \in C_2 \mid \exists x \in X.~R~x~y\}\).
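On finite sets, the right-hand side of Strassen's theorem can be checked by brute force over all subsets \(X\). The sketch below assumes that \(C_1\) and \(C_2\) are the key sets of the two dictionaries; the function names are hypothetical and the enumeration is exponential, so it is only meant for tiny examples.

```python
from fractions import Fraction
from itertools import chain, combinations

def subsets(xs):
    """All subsets of a finite collection, as tuples."""
    xs = list(xs)
    return chain.from_iterable(combinations(xs, k) for k in range(len(xs) + 1))

def strassen_condition(mu1, mu2, rel):
    """Check Pr_mu1[X] <= Pr_mu2[R(X)] for every subset X of the points of mu1."""
    for X in subsets(mu1):
        image = {y for y in mu2 if any(rel(x, y) for x in X)}  # R(X)
        lhs = sum((mu1[x] for x in X), Fraction(0))
        rhs = sum((mu2[y] for y in image), Fraction(0))
        if lhs > rhs:
            return False
    return True

def bernoulli(p):
    """Distribution on {0, 1} that returns 1 with probability p."""
    return {1: p, 0: 1 - p}

low, high = bernoulli(Fraction(1, 3)), bernoulli(Fraction(2, 3))
leq = lambda x, y: x <= y
```

Here `strassen_condition(low, high, leq)` holds while `strassen_condition(high, low, leq)` fails, so by the theorem a \((\le )\)-coupling exists in the first direction only, without exhibiting any bijection between sample spaces.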
An important property of couplings is closure under sequential composition.
Lemma 1 (Sequential composition couplings)
We conclude this section with the following lemma, which follows from Strassen’s theorem:
Lemma 2 (Fundamental lemma of couplings)
This lemma can be used to prove probabilistic inequalities from the existence of suitable couplings:
Corollary 1
 1.
If \(\diamond _{\mu _1, \mu _2}. (=)\), then for all \(x\in C\), \(\mu _1(x) = \mu _2(x)\).
 2.
If \(C = \mathbb {N}\) and \(\diamond _{\mu _1, \mu _2}. (\ge )\), then for all \(n\in \mathbb {N}\), \(\Pr _{x\leftarrow \mu _1}[x\ge n] \ge \Pr _{x\leftarrow \mu _2}[x\ge n]\)
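The second item can be derived directly from the definition of an \(R\)-coupling. As a sketch, assume \(\mu \) is a \((\ge )\)-coupling for \(\mu _1\) and \(\mu _2\), so that \(x_1 \ge x_2\) holds on the support of \(\mu \); then for every \(n\):

```latex
\[
  \Pr_{x \leftarrow \mu_1}[x \ge n]
  = \Pr_{(x_1, x_2) \leftarrow \mu}[x_1 \ge n]
  \ge \Pr_{(x_1, x_2) \leftarrow \mu}[x_2 \ge n]
  = \Pr_{x \leftarrow \mu_2}[x \ge n].
\]
```

The two equalities use the marginal conditions, and the inequality holds because \(x_2 \ge n\) implies \(x_1 \ge n\) with probability 1 under \(\mu \).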
3 Overview of the System
In this section we give a high-level overview of our system, with the details in Sects. 4, 5 and 6. We start by presenting the base logic, and then we show how to extend it with probabilities and how to build a relational reasoning system on top of it.
3.1 Base Logic: Guarded Higher-Order Logic
3.2 A System for Relational Reasoning
We then extend Guarded HOL with a modality \(\diamond \) that lifts assertions over discrete types \(C_1\) and \(C_2\) to assertions over \(\mathsf {D}(C_1)\) and \(\mathsf {D}(C_2)\). Concretely, we define for every assertion \(\phi \), variables \(x_1\) and \(x_2\) of type \(C_1\) and \(C_2\) respectively, and expressions \(t_1\) and \(t_2\) of type \(\mathsf {D}(C_1)\) and \(\mathsf {D}(C_2)\) respectively, the modal assertion \(\diamond _{ [x_1\leftarrow t_1,x_2\leftarrow t_2]} \phi \) which holds iff the interpretations of \(t_1\) and \(t_2\) are related by the probabilistic lifting of the interpretation of \(\phi \). We call this new logic Probabilistic Guarded HOL.
Informally, the rule stipulates the existence of an invariant \(\phi \) over states. The first premise insists that the invariant hold on the initial states, the condition \(\psi _3\) states that the transition functions preserve the invariant, and \(\psi _4\) states that the invariant \(\phi \) over pairs of states can be lifted to a stream property \(\phi '\).
Other rules of the logic are given in Fig. 1. The language construct \(\mathsf {munit}\) creates a point distribution whose entire mass is at its argument. Accordingly, the [UNIT] rule creates a straightforward coupling. The [MLET] rule internalizes sequential composition of couplings (Lemma 1) into the proof system. The construct \(\mathsf {let}~x=t~\mathsf {in}~t'\) composes a distribution t with a probabilistic computation \(t'\) with one free variable x by sampling x from t and running \(t'\). The [MLETL] rule supports one-sided reasoning about \(\mathsf {let}~x=t~\mathsf {in}~t'\) and relies on the fact that couplings are closed under convex combinations. Note that one premise of the rule uses a unary judgement, with a non-relational modality \(\diamond _{[x\leftarrow \mathbf {r}]} \phi \) whose informal meaning is that \(\phi \) holds with probability 1 in the distribution \(\mathbf {r}\).
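The intended meaning of \(\mathsf {munit}\) and \(\mathsf {let}~x=t~\mathsf {in}~t'\) on finite distributions can be sketched in Python. The function names `munit` and `mlet` mirror the constructs but are only an illustrative model, not the calculus itself.

```python
from fractions import Fraction

def munit(x):
    """Point distribution whose entire mass is at x."""
    return {x: Fraction(1)}

def mlet(t, body):
    """Model of `let x = t in body(x)`: sample x from t, then run body(x)."""
    out = {}
    for x, p in t.items():
        for y, q in body(x).items():
            out[y] = out.get(y, Fraction(0)) + p * q
    return out

coin = {0: Fraction(1, 2), 1: Fraction(1, 2)}
# Sum of two independent fair coin flips.
two_flips = mlet(coin, lambda a: mlet(coin, lambda b: munit(a + b)))
```

In this model `two_flips` assigns probability 1/4, 1/2 and 1/4 to the outcomes 0, 1 and 2 respectively.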
3.3 Examples
We formalize elementary examples from the literature on security and Markov chains. None of these examples can be verified in prior systems. Uniformity of one-time pad and lumping of random walks cannot even be stated in prior systems because the two related expressions in these examples have different types. The random walk vs lazy random walk (shift coupling) cannot be proved in prior systems because it requires either asynchronous reasoning or code rewriting. Finally, the biased coin example (stochastic dominance) cannot be proved in prior work because it requires Strassen’s formulation of the existence of coupling (rather than a bijection-based formulation) or code rewriting. We give additional details below.
One-Time Pad/Probabilistic Noninterference. Noninterference [8] is a baseline information flow policy that is often used to model confidentiality of computations. In its simplest form, noninterference distinguishes between public (or low) and private (or high) variables and expressions, and requires that the result of a public expression not depend on the value of its private parameters. This definition naturally extends to probabilistic expressions, except that in this case the evaluation of an expression yields a distribution rather than a value. There are deep connections between probabilistic noninterference and several notions of (information-theoretic) security from cryptography. In this paragraph, we illustrate different flavours of security properties for one-time pad encryption. Similar reasoning can be carried out for proving (passive) security of secure multiparty computation algorithms in the 3-party or multi-party setting [9, 10].
One-time pad is a perfectly secure symmetric encryption scheme. Its space of plaintexts, ciphertexts and keys is the set \(\{0,1\}^\ell \) of fixed-length bitstrings of size \(\ell \). The encryption algorithm is parametrized by a key k, sampled uniformly over the set of bitstrings \(\{ 0,1 \}^\ell \), and maps every plaintext m to the ciphertext \(c = k \oplus m\), where the operator \(\oplus \) denotes bitwise exclusive-or on bitstrings. We let \(\mathsf {otp}\) denote the expression \(\lambda m. \mathsf {let}~k=\mathcal {U}_{\{0,1\}^\ell }~\mathsf {in}~\mathsf {munit}(k\oplus m)\), where \(\mathcal {U}_{X}\) is the uniform distribution over a finite set X.
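The uniformity of \(\mathsf {otp}\) can be confirmed by exhaustive enumeration for a small \(\ell \). This sketch computes the ciphertext distribution exactly; the representation of bitstrings as tuples and the helper names are assumptions made for illustration.

```python
from fractions import Fraction
from itertools import product

def uniform_bitstrings(ell):
    """Uniform distribution over bitstrings of length ell (as tuples of 0/1)."""
    keys = list(product((0, 1), repeat=ell))
    return {k: Fraction(1, len(keys)) for k in keys}

def xor(a, b):
    """Bitwise exclusive-or of two bitstrings of the same length."""
    return tuple(x ^ y for x, y in zip(a, b))

def otp(m):
    """Distribution of k XOR m when k is sampled uniformly."""
    out = {}
    for k, p in uniform_bitstrings(len(m)).items():
        c = xor(k, m)
        out[c] = out.get(c, Fraction(0)) + p
    return out

ELL = 3
messages = list(product((0, 1), repeat=ELL))
```

For every message `m` in `messages`, `otp(m)` equals `uniform_bitstrings(ELL)`, so the ciphertext distribution does not depend on the plaintext. The enumeration only checks one fixed length; the relational proof in the logic covers all \(\ell \) at once.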
Stochastic Dominance. Stochastic dominance defines a partial order between random variables whose underlying set is itself a partial order; it has many different applications in statistical biology (e.g. in the analysis of the birth-and-death processes), statistical physics (e.g. in percolation theory), and economics. First-order stochastic dominance, which we define below, is also an important application of probabilistic couplings. We demonstrate how to use our proof system for proving (first-order) stochastic dominance for a simple Markov process which samples biased coins. While the example is elementary, the proof method extends to more complex examples of stochastic dominance, and illustrates the benefits of Strassen’s formulation of the coupling rule over alternative formulations stipulating the existence of bijections (explained later).
We start by recalling the definition of (first-order) stochastic dominance for the \(\mathbb {N}\)-valued case. The definition extends to arbitrary partial orders.
Definition 4 (Stochastic dominance)
The following result, equivalent to Corollary 1, characterizes stochastic dominance using probabilistic couplings.
Proposition 1
Let \(\mu _1,\mu _2\in \mathsf {D}(\mathbb {N})\). Then \(\mu _1\le _{\mathrm {SD}} \mu _2\) iff \(\diamond _{\mu _1, \mu _2}. (\le )\).
It is instructive to compare our proof with prior formalizations, and in particular with the proof in [5]. Their proof is carried out in the pRHL logic, whose [COUPLING] rule is based on the existence of a bijection that satisfies some property, rather than on our formalization based on Strassen’s Theorem. Their rule is motivated by applications in cryptography, and works well for many examples, but is inconvenient for our example at hand, which involves non-uniform probabilities. Indeed, their proof is based on code rewriting, and is done in two steps. First, they prove equivalence between sampling and returning \(x_1\) from \(\mathcal {B}(p_1)\); and sampling \(z_1\) from \(\mathcal {B}(p_2)\), \(z_2\) from \(\mathcal {B}({}^{p_1}\!/\!_{p_2})\) and returning \(z= z_1 \wedge z_2\). Then, they find a coupling between z and \(\mathcal {B}(p_2)\).
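The two steps of the rewriting-based proof can be replayed numerically for concrete biases. The sketch below assumes \(0 < p_1 \le p_2\) and models \(\mathcal {B}(p)\) as a distribution over Python booleans; `and_of_two` and `joint` are hypothetical names for the rewritten program and the resulting coupling.

```python
from fractions import Fraction

def bernoulli(p):
    """Biased coin: True with probability p."""
    return {True: p, False: 1 - p}

def and_of_two(p2, ratio):
    """Distribution of z1 and z2 with z1 ~ B(p2), z2 ~ B(ratio), independent."""
    out = {True: Fraction(0), False: Fraction(0)}
    for z1, a in bernoulli(p2).items():
        for z2, b in bernoulli(ratio).items():
            out[z1 and z2] += a * b
    return out

def joint(p2, ratio):
    """Joint distribution of (z1 and z2, z1): a candidate coupling."""
    out = {}
    for z1, a in bernoulli(p2).items():
        for z2, b in bernoulli(ratio).items():
            pair = (z1 and z2, z1)
            out[pair] = out.get(pair, Fraction(0)) + a * b
    return out

p1, p2 = Fraction(1, 4), Fraction(1, 2)
rewritten = and_of_two(p2, p1 / p2)  # step 1: same distribution as B(p1)
coupling = joint(p2, p1 / p2)        # step 2: first component <= second
```

Every pair in the support of `coupling` satisfies `z <= z1` (with `False <= True`), which is the \((\le )\)-coupling needed for stochastic dominance. With Strassen's formulation this detour through an auxiliary coin is unnecessary.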
Shift Coupling: Random Walk vs Lazy Random Walk. The previous example is an instance of a lockstep coupling, in that it relates the \(k\)-th element of the first chain with the \(k\)-th element of the second chain. Many examples from the literature follow this lockstep pattern; however, it is not always possible to establish lockstep couplings. Shift couplings are a relaxation of lockstep couplings where we relate elements of the first and second chains without the requirement that their positions coincide.
Lumped Coupling: Random Walks in 3 and 4 Dimensions. A Markov chain is recurrent if it has probability 1 of returning to its initial state, and transient otherwise. It is relatively easy to show that the random walk over \(\mathbb {Z}\) is recurrent. One can also show that the random walk over \(\mathbb {Z}^2\) is recurrent. However, the random walk over \(\mathbb {Z}^3\) is transient.
For higher dimensions, we can use a coupling argument to prove transience. Specifically, we can define a coupling between a lazy random walk in n dimensions and a random walk in \(n +m\) dimensions, and derive transience of the latter from transience of the former. We define the (lazy) random walks below, and sketch the coupling arguments.
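The lumping idea can be illustrated on a single step. The sketch below makes two assumptions that are not taken from the text: the random walk in \(d\) dimensions picks one of the \(2d\) unit moves uniformly, and the lazy walk in \(n\) dimensions stays put with probability \(m/(n+m)\) and otherwise takes a uniform unit move. All function names are hypothetical.

```python
from fractions import Fraction

def walk_step(pos):
    """One step of the (assumed) uniform random walk from `pos` in Z^d."""
    d = len(pos)
    out = {}
    for axis in range(d):
        for delta in (-1, 1):
            nxt = pos[:axis] + (pos[axis] + delta,) + pos[axis + 1:]
            out[nxt] = out.get(nxt, Fraction(0)) + Fraction(1, 2 * d)
    return out

def lazy_walk_step(pos, extra):
    """One step of the (assumed) lazy walk: stay with probability extra/(n+extra)."""
    n = len(pos)
    out = {pos: Fraction(extra, n + extra)}
    for nxt, p in walk_step(pos).items():
        out[nxt] = out.get(nxt, Fraction(0)) + p * Fraction(n, n + extra)
    return out

def project(dist, n):
    """Lump positions that agree on their first n coordinates."""
    out = {}
    for pos, p in dist.items():
        out[pos[:n]] = out.get(pos[:n], Fraction(0)) + p
    return out

lumped = project(walk_step((0, 0, 0, 0)), 3)
lazy = lazy_walk_step((0, 0, 0), 1)
```

Under these assumptions `lumped` and `lazy` coincide: forgetting the fourth coordinate of the walk on \(\mathbb {Z}^4\) yields one step of a lazy walk on \(\mathbb {Z}^3\), which is the one-step shape of a coupling relating the two chains.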
4 Probabilistic Guarded Lambda Calculus
The guarded lambda calculus solves the productivity problem by distinguishing at type level between data that is available now and data that will be available in the future, and restricting when fixpoints can be defined. Specifically, the guarded lambda calculus extends the usual simply typed lambda calculus with two modalities: \(\triangleright \) (pronounced later) and \(\square \) (constant). The later modality represents data that will be available one step in the future, and is introduced and removed by the term formers \(\triangleright \) and \(\mathrm{prev}\ \) respectively. This modality is used to guard recursive occurrences, so for the calculus to remain productive, we must restrict when it can be eliminated. This is achieved via the constant modality, which expresses that all the data is available at all times. In the remainder of this section we present a probabilistic extension of this calculus.
Delayed substitutions were introduced in [13] in a dependent type theory to be able to work with types dependent on terms of later type \(\triangleright A\). In the setting of a simple type theory, such as the one considered in this paper, delayed substitutions are equivalent to having the applicative structure [14] \(\circledast \) for the \(\triangleright \) modality. However, delayed substitutions extend uniformly to the level of propositions, and thus we choose to use them in this paper in place of the applicative structure.
Denotational Semantics. The meaning of terms is given by a denotational model in the category \(\mathcal {S}\) of presheaves over \(\omega \), the first infinite ordinal. This category \(\mathcal {S}\) is also known as the topos of trees [15]. In previous work [1], it was shown how to model most of the constructions of the guarded lambda calculus and its internal logic, with the notable exception of the probabilistic features. Below we give an elementary presentation of the semantics.
Informally, the idea behind the topos of trees is to represent (infinite) objects from their finite approximations, which we observe incrementally as time passes. Given an object x, we can consider a sequence \(\{x_i\}\) of its finite approximations observable at time i. These are trivial for finite objects, such as a natural number, since for any number n, \(n_i = n\) at every i. But for infinite objects such as streams, the \(i\)-th approximation is the prefix of length \(i+1\).

Objects X: families of sets \(\{X_i\}_{i\in \mathbb {N}}\) together with restriction functions \(r_n^X : X_{n+1} \rightarrow X_n\). We will write simply \(r_n\) if X is clear from the context.

Morphisms \(X \rightarrow Y\) : families of functions \(\alpha _n : X_n \rightarrow Y_n\) commuting with restriction functions in the sense of \(r_n^Y \circ \alpha _{n+1} = \alpha _n \circ r_n^X\).
 Streams over a type A are interpreted as sequences of finite prefixes of elements of A with the restriction functions of A:
 Distributions over a discrete object C are defined as a sequence of distributions over each \(\llbracket C \rrbracket _i\): where \(\mathsf {D}(\llbracket C \rrbracket _i)\) is the set of (probability density) functions \(\mu : \llbracket C \rrbracket _i \rightarrow [0,1]\) such that \(\sum _{x \in \llbracket C \rrbracket _i} \mu (x) = 1\), and \(\mathsf {D}(r_i)\) adds the probability density of all the points in \(\llbracket C \rrbracket _{i+1}\) that are sent by \(r_i\) to the same point in \(\llbracket C \rrbracket _{i}\). In other words, \(\mathsf {D}(r_i)(\mu )(x) = \Pr _{y \leftarrow \mu }[r_i(y) = x]\).
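The action of \(\mathsf {D}(r_i)\) is an ordinary pushforward of a discrete distribution along the restriction map. As a sketch, assume stream prefixes are tuples and the restriction map drops the last element; `pushforward` and `drop_last` are hypothetical names.

```python
from fractions import Fraction

def pushforward(f, mu):
    """D(f)(mu): each point x receives the mass of all y with f(y) = x."""
    out = {}
    for y, p in mu.items():
        out[f(y)] = out.get(f(y), Fraction(0)) + p
    return out

def drop_last(prefix):
    """Assumed restriction map on prefixes: forget the newest element."""
    return prefix[:-1]

# A distribution over prefixes of length 2 over the alphabet {a, b}.
mu2 = {
    ("a", "a"): Fraction(1, 2),
    ("a", "b"): Fraction(1, 4),
    ("b", "a"): Fraction(1, 4),
}
mu1 = pushforward(drop_last, mu2)
```

Here `mu1` gives mass 3/4 to `("a",)` and 1/4 to `("b",)`, matching \(\Pr _{y \leftarrow \mu }[r_i(y) = x]\), and the total mass is preserved.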
An important property of the interpretation is that discrete types are interpreted as objects X such that \(X_i\) is finite or countably infinite for every i. This allows us to define distributions on these objects without the need for measure theory. In particular, the type of guarded streams over A is discrete provided A is, which is clear from the interpretation of the stream type. Conceptually this holds because the \(i\)-th component of this interpretation is an approximation of real streams, consisting of only the first \(i+1\) elements.
An object X of \(\mathcal {S}\) is constant if all its restriction functions are bijections. Constant types are interpreted as constant objects of \(\mathcal {S}\) and for a constant type A the objects \(\llbracket \square A \rrbracket \) and \(\llbracket A \rrbracket \) are isomorphic in \(\mathcal {S}\).
Typing Rules. Terms are typed under a dual context \(\varDelta \mid \varGamma \), where \(\varGamma \) is a usual context that binds variables to a type, and \(\varDelta \) is a constant context containing variables bound to types that are constant. The term \(\mathrm{letc}\ x \leftarrow u\ \mathrm{in}\ t\) allows us to shift variables between constant and nonconstant contexts. The typing rules can be found in Fig. 2.
The semantics of such a dual context \(\varDelta \mid \varGamma \) is given as the product of types in \(\varDelta \) and \(\varGamma \), except that we implicitly add \(\square \) in front of every type in \(\varDelta \). In the particular case when both contexts are empty, the semantics of the dual context corresponds to the terminal object 1, which is the singleton set \(\{*\}\) at each time.
5 Guarded Higher-Order Logic
The cases for the semantics of the judgement \(\varDelta \mid \varGamma \vdash \phi \) can be found in the appendix. It can be shown that this logic is sound with respect to its model in the topos of trees.
Theorem 2 (Soundness of the semantics)
The semantics of guarded higher-order logic is sound: if \(\varDelta \mid \varSigma \mid \varGamma \mid \varPsi \vdash \phi \) is derivable then for all \(n \in \mathbb {N}\), \(\llbracket \square \varSigma \rrbracket _n \cap \llbracket \varPsi \rrbracket _n \subseteq \llbracket \phi \rrbracket _n\).
In addition, Guarded HOL is expressive enough to axiomatize standard probabilities over discrete sets. This axiomatization can be used to define the \(\diamond \) modality directly in Guarded HOL (as opposed to our relational proof system, where we use it as a primitive). Furthermore, we can derive from this axiomatization additional rules to reason about couplings, which can be seen in Fig. 4. These rules will be the key to proving the soundness of the probabilistic fragment of the relational proof system, and can be shown to be sound themselves.
Proposition 2 (Soundness of derived rules)
The additional rules are sound.
6 Relational Proof System
We complete the formal description of the system by describing the proof rules for the non-probabilistic fragment of the relational proof system (the rules of the probabilistic fragment were described in Sect. 3.2).
6.1 Proof Rules
The rules for core \(\lambda \)-calculus constructs are identical to those of [2]; for convenience, we present a selection of the main rules in Fig. 7 in the appendix.
We briefly comment on the two-sided rules for the new constructs (Fig. 5). The notation \(\varOmega \) abbreviates a context \(\varDelta \mid \varSigma \mid \varGamma \mid \varPsi \). The rule [Next] relates two terms that have a \(\triangleright \) term constructor at the top level. We require that both have one term in the delayed substitutions and that they are related pairwise. Then this relation is used to prove another relation between the main terms. This rule can be generalized to terms with more than one term in the delayed substitution. The rule [Prev] proves a relation between terms from the same delayed relation by applying \(\mathrm {prev}\) to both terms. The rule [Box] proves a relation between two boxed terms if the same relation can be proven in a constant context. Dually, [LetBox] uses a relation between two boxed terms to prove a relation between their unboxings. [LetConst] is similar to [LetBox], but it requires instead a relation between two constant terms, rather than explicitly \(\square \)-ed terms. The rule [Fix] relates two fixpoints following the [Loeb] rule from Guarded HOL. Notice that in the premise, the fixpoints need to appear in the delayed substitution so that the inductive hypothesis is well-formed. The rule [Cons] proves relations on streams from relations between their heads and tails, while [Head] and [Tail] behave as converses of [Cons].
Figure 6 contains the one-sided versions of the rules. We only present the left-sided versions as the right-sided versions are completely symmetric. The rule [NextL] relates at \(\phi \) a term that has a \(\triangleright \) with a term that does not have a \(\triangleright \). First, a unary property \(\phi '\) is proven on the term u in the delayed substitution, and it is then used as a premise to prove \(\phi \) on the terms with delays removed. Rules for proving unary judgements can be found in the appendix. Similarly, [LetBoxL] proves a unary property on the term that gets unboxed and then uses it as a precondition. The rule [FixL] builds a fixpoint just on the left, and relates it with an arbitrary term \(t_2\) at a property \(\phi \). Since \(\phi \) may contain the variable \(\mathbf {r}_2\) which is not in the context, it has to be replaced when adding the \(\triangleright \)-guarded induction hypothesis to the logical context in the premise of the rule. The remaining rules are similar to their two-sided counterparts.
6.2 Metatheory
We review some of the most interesting metatheoretical properties of our relational proof system, highlighting the equivalence with Guarded HOL.
Theorem 3 (Equivalence with Guarded HOL)
The forward implication follows by induction on the given derivation. The reverse implication is immediate from the rule [SUB] (given in the appendix), which allows one to fall back on Guarded HOL in relational proofs. The full proof is in the appendix. The consequence of this theorem is that the syntax-directed, relational proof system we have built on top of Guarded HOL does not lose expressiveness.
Corollary 2 (Soundness and consistency)
6.3 Shift Couplings Revisited
In fact, we can assume that, in general, we have a family of \({\text {All}}_{m_1, m_2}\) predicates relating two streams at positions \(m_1\cdot i\) and \(m_2\cdot i\) for every i.
This asynchronous rule for Markov chains shares the motivations of the rule for loops proposed in [6]. Note that one can define a rule [Markovmn] for arbitrary m and n to prove a judgement of the form \({\text {All}}_{m,n}\) on two Markov chains.
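On finite prefixes, the family of \({\text {All}}_{m_1, m_2}\) predicates can be approximated by a simple check. This is a sketch under assumptions: streams are cut to Python lists, \(m_1\) and \(m_2\) are positive, and `all_shifted` is a hypothetical name for the finite approximation.

```python
def all_shifted(rel, m1, m2, xs, ys):
    """Check rel(xs[m1*i], ys[m2*i]) for every i where both indices exist."""
    if m1 < 1 or m2 < 1:
        raise ValueError("m1 and m2 are assumed to be positive")
    i = 0
    while m1 * i < len(xs) and m2 * i < len(ys):
        if not rel(xs[m1 * i], ys[m2 * i]):
            return False
        i += 1
    return True

# The second stream repeats every element of the first one twice.
xs = [0, 1, 2, 3]
ys = [0, 0, 1, 1, 2, 2, 3, 3]
equal = lambda a, b: a == b
```

With these prefixes, `all_shifted(equal, 1, 2, xs, ys)` holds, relating position \(i\) of `xs` to position \(2i\) of `ys`, whereas the lockstep check `all_shifted(equal, 1, 1, xs, ys)` fails.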
7 Related Work
Our probabilistic guarded \(\lambda \)calculus and the associated logic Guarded HOL build on top of the guarded \(\lambda \)calculus and its internal logic [1]. The guarded \(\lambda \)calculus has been extended to guarded dependent type theory [13], which can be understood as a theory of guarded refinement types and as a foundation for proof assistants based on guarded type theory. These systems do not reason about probabilities, and do not support syntaxdirected (relational) reasoning, both of which we support.
Relational models for higherorder programming languages are often defined using logical relations. [16] showed how to use secondorder logic to define and reason about logical relations for the secondorder lambda calculus. Recent work has extended this approach to logical relations for higherorder programming languages with computational effects such as nontermination, general references, and concurrency [17, 18, 19, 20]. The logics used in loc. cit. are related to our work in two ways: (1) the logics in loc. cit. make use of the later modality for reasoning about recursion, and (2) the models of the logics in loc. cit. can in fact be defined using guarded type theory. Our work is more closely related to Relational Higher Order Logic [2], which applies the idea of logicenriched type theories [21, 22] to a relational setting. There exist alternative approaches for reasoning about relational properties of higherorder programs; for instance, [23] have recently proposed to use monadic reification for reducing relational verification of \(F^*\) to proof obligations in higherorder logic.
A series of works develops reasoning methods for probabilistic higher-order programs for different variations of the lambda calculus. One line of work has focused on operationally-based techniques for reasoning about contextual equivalence of programs. The methods are based on probabilistic bisimulations [24, 25] or on logical relations [26]. Most of these approaches have been developed for languages with discrete distributions, but recently there has also been work on languages with continuous distributions [27, 28]. Another line of work has focused on denotational models, starting with the seminal work in [29]. Recent work includes support for relational reasoning about equivalence of programs with continuous distributions for a total programming language [30]. Our approach is most closely related to prior work based on relational refinement types for higher-order probabilistic programs. These were initially considered by [31] for a stateful fragment of \(F^*\), and later by [32, 33] for a pure language. Both systems are specialized to building probabilistic couplings; however, the latter support approximate probabilistic couplings, which yield a natural interpretation of differential privacy [34], both in its vanilla and approximate forms (i.e. \(\epsilon\)- and \((\epsilon ,\delta )\)-privacy). Technically, approximate couplings are modelled as a graded monad, where the index of the monad tracks the privacy budget (\(\epsilon \) or \((\epsilon ,\delta )\)). Both systems are strictly syntax-directed, and cannot reason about computations that have different types or syntactic structures, while our system can.
8 Conclusion
We have developed a probabilistic extension of the (simply typed) guarded \(\lambda\)-calculus, and proposed a syntax-directed proof system for relational verification. Moreover, we have verified a series of examples that are beyond the reach of prior work. Finally, we have proved the soundness of the proof system with respect to the topos of trees.
There are several natural directions for future work. A first direction is to enhance the expressiveness of the underlying simply typed language. For instance, it would be interesting to introduce clock variables and some type dependency as in [13], and extend the proof system accordingly. This would allow us, for example, to type the function taking the nth element of a guarded stream, which cannot be done in the current system. Another exciting direction is to consider approximate couplings, as in [32, 33], and to develop differential privacy for infinite streams; preliminary work in this direction, such as [35], considers very large lists, but not arbitrary streams. A final direction would be to extend our approach to continuous distributions to support other application domains.
Acknowledgments
We would like to thank the anonymous reviewers for their time and their helpful input. This research was supported in part by the ModuRes Sapere Aude Advanced Grant from The Danish Council for Independent Research for the Natural Sciences (FNU), by a research grant (12386, Guarded Homotopy Type Theory) from the VILLUM foundation, and by NSF under grant 1718220.
References
 1. Clouston, R., Bizjak, A., Grathwohl, H.B., Birkedal, L.: The guarded lambda-calculus: programming and reasoning with guarded recursion for coinductive types. Log. Methods Comput. Sci. 12(3) (2016)
 2. Aguirre, A., Barthe, G., Gaboardi, M., Garg, D., Strub, P.: A relational logic for higher-order programs. PACMPL 1(ICFP), 21:1–21:29 (2017)
 3. Lindvall, T.: Lectures on the Coupling Method. Courier Corporation (2002)
 4. Thorisson, H.: Coupling, Stationarity, and Regeneration. Springer, New York (2000)
 5. Barthe, G., Espitau, T., Grégoire, B., Hsu, J., Stefanesco, L., Strub, P.-Y.: Relational reasoning via probabilistic coupling. In: Davis, M., Fehnker, A., McIver, A., Voronkov, A. (eds.) LPAR 2015. LNCS, vol. 9450, pp. 387–401. Springer, Heidelberg (2015). https://doi.org/10.1007/978-3-662-48899-7_27
 6. Barthe, G., Grégoire, B., Hsu, J., Strub, P.: Coupling proofs are probabilistic product programs. In: POPL 2017, Paris, France, 18–20 January 2017 (2017)
 7. Strassen, V.: The existence of probability measures with given marginals. Ann. Math. Stat. 36, 423–439 (1965)
 8. Goguen, J.A., Meseguer, J.: Security policies and security models. In: IEEE Symposium on Security and Privacy, pp. 11–20 (1982)
 9. Bogdanov, D., Niitsoo, M., Toft, T., Willemson, J.: High-performance secure multiparty computation for data mining applications. Int. J. Inf. Sec. 11(6), 403–418 (2012)
 10. Cramer, R., Damgard, I.B., Nielsen, J.B.: Secure Multiparty Computation and Secret Sharing, 1st edn. Cambridge University Press, New York (2015)
 11. Barthe, G., Espitau, T., Grégoire, B., Hsu, J., Strub, P.: Proving uniformity and independence by self-composition and coupling. CoRR abs/1701.06477 (2017)
 12. Canetti, R.: Universally composable security: a new paradigm for cryptographic protocols. In: Proceedings of Foundations of Computer Science. IEEE (2001)
 13. Bizjak, A., Grathwohl, H.B., Clouston, R., Møgelberg, R.E., Birkedal, L.: Guarded dependent type theory with coinductive types. In: Jacobs, B., Löding, C. (eds.) FoSSaCS 2016. LNCS, vol. 9634, pp. 20–35. Springer, Heidelberg (2016). https://doi.org/10.1007/978-3-662-49630-5_2
 14. McBride, C., Paterson, R.: Applicative programming with effects. J. Funct. Program. 18(1), 1–13 (2008)
 15. Birkedal, L., Møgelberg, R.E., Schwinghammer, J., Støvring, K.: First steps in synthetic guarded domain theory: step-indexing in the topos of trees. Log. Methods Comput. Sci. 8(4) (2012)
 16. Plotkin, G., Abadi, M.: A logic for parametric polymorphism. In: Bezem, M., Groote, J.F. (eds.) TLCA 1993. LNCS, vol. 664, pp. 361–375. Springer, Heidelberg (1993). https://doi.org/10.1007/BFb0037118
 17. Dreyer, D., Ahmed, A., Birkedal, L.: Logical step-indexed logical relations. Log. Methods Comput. Sci. 7(2) (2011)
 18. Turon, A., Dreyer, D., Birkedal, L.: Unifying refinement and Hoare-style reasoning in a logic for higher-order concurrency. In: Morrisett, G., Uustalu, T. (eds.) ICFP 2013, Boston, MA, USA, 25–27 September 2013. ACM (2013)
 19. Krebbers, R., Timany, A., Birkedal, L.: Interactive proofs in higher-order concurrent separation logic. In: Castagna, G., Gordon, A.D. (eds.) POPL 2017, Paris, France, 18–20 January 2017. ACM (2017)
 20. Krogh-Jespersen, M., Svendsen, K., Birkedal, L.: A relational model of types-and-effects in higher-order concurrent separation logic. In: POPL 2017, Paris, France, 18–20 January 2017, pp. 218–231 (2017)
 21. Aczel, P., Gambino, N.: Collection principles in dependent type theory. In: Callaghan, P., Luo, Z., McKinna, J., Pollack, R. (eds.) TYPES 2000. LNCS, vol. 2277, pp. 1–23. Springer, Heidelberg (2002). https://doi.org/10.1007/3-540-45842-5_1
 22. Aczel, P., Gambino, N.: The generalised type-theoretic interpretation of constructive set theory. J. Symb. Log. 71(1), 67–103 (2006)
 23. Grimm, N., Maillard, K., Fournet, C., Hritcu, C., Maffei, M., Protzenko, J., Rastogi, A., Swamy, N., Béguelin, S.Z.: A monadic framework for relational verification (functional pearl). CoRR abs/1703.00055 (2017)
 24. Crubillé, R., Dal Lago, U.: On probabilistic applicative bisimulation and call-by-value \(\lambda\)-calculi. In: Shao, Z. (ed.) ESOP 2014. LNCS, vol. 8410, pp. 209–228. Springer, Heidelberg (2014). https://doi.org/10.1007/978-3-642-54833-8_12
 25. Sangiorgi, D., Vignudelli, V.: Environmental bisimulations for probabilistic higher-order languages. In: Bodík, R., Majumdar, R. (eds.) POPL 2016, St. Petersburg, FL, USA, 20–22 January 2016. ACM (2016)
 26. Bizjak, A., Birkedal, L.: Step-indexed logical relations for probability. In: Pitts, A. (ed.) FoSSaCS 2015. LNCS, vol. 9034, pp. 279–294. Springer, Heidelberg (2015). https://doi.org/10.1007/978-3-662-46678-0_18
 27. Borgström, J., Lago, U.D., Gordon, A.D., Szymczak, M.: A lambda-calculus foundation for universal probabilistic programming. In: Garrigue, J., Keller, G., Sumii, E. (eds.) ICFP 2016, Nara, Japan, 18–22 September 2016. ACM (2016)
 28. Culpepper, R., Cobb, A.: Contextual equivalence for probabilistic programs with continuous random variables and scoring. In: Yang, H. (ed.) ESOP 2017. LNCS, vol. 10201, pp. 368–392. Springer, Heidelberg (2017). https://doi.org/10.1007/978-3-662-54434-1_14
 29. Jones, C., Plotkin, G.D.: A probabilistic powerdomain of evaluations. In: LICS 1989, Pacific Grove, California, USA, 5–8 June 1989. IEEE Computer Society (1989)
 30. Staton, S., Yang, H., Wood, F., Heunen, C., Kammar, O.: Semantics for probabilistic programming: higher-order functions, continuous distributions, and soft constraints. In: LICS 2016, New York, NY, USA, 5–8 July 2016. ACM (2016)
 31. Barthe, G., Fournet, C., Grégoire, B., Strub, P., Swamy, N., Béguelin, S.Z.: Probabilistic relational verification for cryptographic implementations. In: Jagannathan, S., Sewell, P. (eds.) POPL 2014 (2014)
 32. Barthe, G., Gaboardi, M., Gallego Arias, E.J., Hsu, J., Roth, A., Strub, P.-Y.: Higher-order approximate relational refinement types for mechanism design and differential privacy. In: POPL 2015, Mumbai, India, 15–17 January 2015 (2015)
 33. Barthe, G., Farina, G.P., Gaboardi, M., Arias, E.J.G., Gordon, A., Hsu, J., Strub, P.: Differentially private Bayesian programming. In: CCS 2016, Vienna, Austria, 24–28 October 2016. ACM (2016)
 34. Dwork, C., Roth, A.: The algorithmic foundations of differential privacy. Found. Trends Theor. Comput. Sci. 9(3–4), 211–407 (2014)
 35. Kellaris, G., Papadopoulos, S., Xiao, X., Papadias, D.: Differentially private event sequences over infinite streams. PVLDB 7(12), 1155–1166 (2014)
Copyright information
Open Access This chapter is licensed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license and indicate if changes were made. The images or other third party material in this book are included in the book's Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the book's Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.