Strong Adequacy and Untyped Full Abstraction for Probabilistic Coherence Spaces
Abstract
We consider the probabilistic untyped lambda-calculus and prove a stronger form of the adequacy property for probabilistic coherence spaces (PCoh), showing how the denotation of a term statistically distributes over the denotations of its head-normal forms.
We use this result to state a precise correspondence between PCoh and a notion of probabilistic Nakajima trees, recently introduced by Leventis in order to prove a separation theorem. As a consequence, we get full abstraction for PCoh. This latter result has already been mentioned as a corollary of Clairambault and Paquet’s full abstraction theorem for probabilistic concurrent games. Our approach allows us to prove the property directly, without the need of a third model.
Keywords
Lambda-calculus · Denotational semantics · Probabilistic functional programming
1 Introduction
Full abstraction for the maximal consistent sensible \(\lambda \)-theory \(\mathcal {H}^\star \) [1] is a crucial property for a model of the untyped \(\lambda \)-calculus, stating that two terms M, N have the same denotation in the model iff for every context \(C[\,]\) the head-reduction sequences of C[M] and C[N] either both terminate or both diverge. The first such result was obtained for Scott’s model \(\mathcal {D}^\infty \) by Hyland [10] and Wadsworth [15]. More recently, Manzonetto developed a general technique for achieving full abstraction for a large class of models, decomposing it into the adequacy property and a notion of well-stratification [13]. An adequacy property states that the semantics of a \(\lambda \)-term is different from the bottom element iff its head-reduction terminates. Well-stratification is more technical: basically, it means that the semantics of a \(\lambda \)-term can be stratified into different levels, expressing in the model the nesting of the head-normal forms defining the interaction between a \(\lambda \)-term and a context.
Our paper reconsiders these results in the setting of the probabilistic untyped \(\lambda \)-calculus \(\varLambda ^+\). The language extends the untyped \(\lambda \)-calculus with a barycentric sum constructor allowing for terms like \(M+_pN\), with \(p\in [0,1]\), reducing to M with probability p and to N with probability \(1-p\). In recent years there has been a renewed interest in \(\varLambda ^+\) as a core language for (untyped) discrete probabilistic functional programming. In particular, Leventis proves in [12] a separation property for \(\varLambda ^+\) based on a probabilistic version of Nakajima trees, the latter describing a nesting of subprobability distributions of infinitary \(\eta \)-long head-normal forms (see Sect. 5 and the examples in Fig. 2).
We consider the semantics of \(\varLambda ^+\) given by the probabilistic coherence space \(\mathcal {D}\) defined by Danos and Ehrhard in [5] and proved to be adequate in [6]. We show that the denotation \(\llbracket M\rrbracket \) in \(\mathcal {D}\) of a \(\varLambda ^+\) term M enjoys a kind of stratification property (Theorem 1, called here strong adequacy), and we use this property to prove that \(\llbracket M\rrbracket \) is a faithful description of the probabilistic Nakajima tree of M (Corollary 1). As a consequence of this result and the previously mentioned separation theorem, we achieve full abstraction for \(\mathcal {D}\) (Theorem 2), thus reconstructing in this setting Manzonetto’s reasoning for the classical \(\lambda \)-calculus.
Very recently, and independently from this work, Clairambault and Paquet have also proved full abstraction for \(\mathcal {D}\) [2]. Their proof uses a game semantics model representing in an abstract way the probabilistic Nakajima trees, together with a faithful functor from this game semantics to the weighted relational semantics of [11]. The latter provides a model having the same equational theory over \(\varLambda ^+\) as the probabilistic coherence space \(\mathcal {D}\), so full abstraction for \(\mathcal {D}\) follows immediately. Let us emphasise, by the way, that all results in our paper can be transferred as they are to the weighted relational semantics of [11]. We decided however to consider the probabilistic coherence space model in order to highlight the correspondence between the definition of \(\mathcal {D}\) (Eq. (11)) and the definition of the logical relation (Eq. (13)), which is the key ingredient in the proof of our notion of stratification.
Let us give some more intuition on this latter notion, which is of interest in its own right. The model \(\mathcal {D}\) is defined as the limit of a chain of probabilistic coherence spaces \((\mathcal {D}_\ell )_{\ell \in \mathbb {N}}\) approximating more and more closely the denotation of \(\varLambda ^+\) terms. The adequacy property proven in [6] states that the probability of a term M converging to a head-normal form is given by the mass of the semantics \(\llbracket M\rrbracket \) restricted to the subspace \(\mathcal {D}_2\) [6, Theorem 22]. The natural question is then to understand what kind of operational meaning the rest of the mass of \(\llbracket M\rrbracket \) carries, i.e. the points of order greater than 2. Our Theorem 1 answers this question, showing that the semantics \(\llbracket M\rrbracket \) distributes over the semantics of its head-normal forms according to the operational semantics of \(\varLambda ^+\). By iterating this reasoning one gets a stratification of \(\llbracket M\rrbracket \) into a nesting of (\(\eta \)-expanded) head-normal forms, which is the key ingredient linking \(\llbracket M\rrbracket \) and the probabilistic Nakajima trees (Corollary 1).
The fact that our proof of full abstraction is based on the notion of strong adequacy makes it very plausible that the proof can be adapted to a more general class of models than just probabilistic coherence spaces and weighted semantics. In particular, we would like to stress that we did not use the analyticity of term denotations, which is instead at the core of the proof of full abstraction for probabilistic PCF-like languages [7, 8].
Notational convention. We write \(\mathbb {N}\) for the set of natural numbers and \(\mathbb {R}_{\ge 0}\) for the set of nonnegative real numbers. Given any set X we write \(\mathcal {M}_{\text {f}}\!\left( X\right) \) for the set of finite multisets over X: an element \(m \in \mathcal {M}_{\text {f}}\!\left( X\right) \) is a function \(X \rightarrow \mathbb {N}\) such that the support of m, \(\text {Supp}\left( m\right) = \{x \in X \mid m(x) > 0\}\), is finite. We write \([x_1,\dots ,x_n]\) for the multiset m such that \(m(x) = \textit{number of indices}\ i\ \textit{s.t.}\ x=x_i\); in particular [] is the empty multiset, and we write \(\uplus \) for the disjoint union. The Kronecker delta over a set X is defined for \(x,y \in X\) by: \(\delta _{x,y} = 1\) if \(x=y\), and \(\delta _{x,y} =0\) otherwise.
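As a concrete rendering of these conventions, the following Python sketch (ours, not part of the paper) models finite multisets via `collections.Counter`, whose missing keys default to 0, exactly as a function \(X \rightarrow \mathbb {N}\) with finite support:

```python
from collections import Counter

def multiset(*xs):
    """[x1,...,xn]: maps each x to the number of indices i with x = x_i."""
    return Counter(xs)

def disjoint_union(m1, m2):
    """The disjoint union (+) of two finite multisets adds multiplicities."""
    return m1 + m2

def delta(x, y):
    """Kronecker delta over any set X."""
    return 1 if x == y else 0

m = disjoint_union(multiset('a', 'a'), multiset('a', 'b'))
assert m['a'] == 3 and m['b'] == 1 and m['c'] == 0  # support is {'a', 'b'}
assert delta('a', 'a') == 1 and delta('a', 'b') == 0
```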
2 The Probabilistic Language \(\varLambda ^+\)
Example 1
Some terms useful in giving examples: the duplicator \(\mathbf {\delta } = \lambda x.xx\), the Turing fixed point combinator \(\mathbf \Theta = (\lambda xy.y(xxy))(\lambda xy.y(xxy))\) and \(\mathbf \varOmega = \mathbf \delta \mathbf \delta \).
Example 2
Let \(L=(x+_py)\); we have \(\mathrm {Red}_{\delta L,LL} = 1\), and \(\mathrm {Red}^n_{\delta L, xL}=p\), \(\mathrm {Red}^n_{\delta L, yL}=1-p\) for all \(n\ge 2\). In fact both xL and yL are head-normal forms, hence absorbing states. The term \(\mathbf \varOmega \) \(\beta \)-reduces to itself, so \(\mathrm {Red}^n_{\varOmega ,\varOmega } = 1\) for any n, giving an example of an absorbing state which is not a head-normal form.
The Turing fixed point combinator needs two \(\beta \)-steps to unfold its argument, so, for any term M, \(\mathrm {Red}^{2}_{\mathbf \Theta M,M(\mathbf \Theta M)}=1\). In case M is a probabilistic function like \(M=\lambda f.(f+_p y)\), we get \(\mathrm {Red}^{4n}_{\mathbf \Theta M,\mathbf \Theta M}=p^n\) and \(\mathrm {Red}^{4n}_{\mathbf \Theta M,y}=1-p^n\), for any n. In case \(M=\lambda f.(yf+_p y)\), we get: \(\mathrm {Red}^{4(n+1)}_{\mathbf \Theta M,y^n(\mathbf \Theta M)}=p^{n+1}\) and \(\mathrm {Red}^{4(n+1)}_{\mathbf \Theta M,y^n(y)}=(1-p)p^n\), where \(y^n(...)\) denotes the n-fold application \(y(\dots y(...))\).
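The reduction of \(\mathbf \Theta M\) with \(M=\lambda f.(f+_p y)\) behaves as a two-state Markov chain: every four head-reduction steps, the state either loops back to \(\mathbf \Theta M\) with probability p or absorbs at y. The following Python sketch (our own illustration, abstracting away the syntax) computes \(\mathrm {Red}^{4n}\) by iterating this chain and checks it against the closed forms \(p^n\) and \(1-p^n\):

```python
def red_4n(p, n):
    """Return (Red^{4n}_{ThetaM, ThetaM}, Red^{4n}_{ThetaM, y})
    for M = \\f.(f +_p y), modelled as a two-state Markov chain."""
    live, done = 1.0, 0.0
    for _ in range(n):
        done += live * (1 - p)  # absorb at the head-normal form y
        live *= p               # loop back to Theta M
    return live, done

p = 0.5
for n in range(6):
    live, done = red_4n(p, n)
    assert abs(live - p**n) < 1e-12        # Red^{4n}_{ThetaM, ThetaM} = p^n
    assert abs(done - (1 - p**n)) < 1e-12  # Red^{4n}_{ThetaM, y} = 1 - p^n
```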
Example 3
Recall the terms in Example 2. We have \(\mathrm {Red}^\infty _{\delta L,xL} = p\) and \(\mathrm {Red}^\infty _{\delta L,yL} = 1-p\). For any \(h \in \mathrm {HNF}\) and \(n \in \mathbb {N}\) we have \(\mathrm {Red}^n_{\mathbf \varOmega ,h}=0\), so \(\mathrm {Red}^\infty _{\mathbf \varOmega ,h}=0\). The quantity \(\mathrm {Red}^\infty _{\mathbf \Theta (\lambda f.(f+_p y)),y}\) is the first example of a genuine limit, being equal to 1 whereas \(\mathrm {Red}^n_{\mathbf \Theta (\lambda f.(f+_p y)),y}<1\) for all \(n \in \mathbb {N}\). Operationally this means that the term \(\mathbf \Theta (\lambda f.(f+_p y))\) reduces to y with probability 1, but the length of these reductions is not bounded. Finally, \(\mathrm {Red}^\infty _{\mathbf \Theta (\lambda f.(yf+_p y)),y^n(y)}=(1-p)p^n\): this means that \(\mathbf \Theta (\lambda f.(yf+_p y))\) converges with probability 1 but can reach infinitely many different head-normal forms.
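The last claim rests on the geometric series \(\sum _{n\ge 0}(1-p)p^n = 1\) for \(p\in [0,1)\): the limit distribution \(\mathrm {Red}^\infty \) spreads its mass over infinitely many head-normal forms \(y^n(y)\), yet the total mass is 1. A small numerical check (ours, not from the paper) of this identity:

```python
def total_mass(p, cutoff):
    """Partial sum of the limit masses (1-p) p^n over n < cutoff."""
    return sum((1 - p) * p**n for n in range(cutoff))

# For p in [0,1) the partial sums tend to 1, i.e. the term converges
# with probability 1 although no single head-normal form carries it all.
for p in (0.1, 0.5, 0.9):
    assert abs(total_mass(p, 2000) - 1.0) < 1e-9
```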
Given \(M,N\in \varLambda ^+\), we say that M is contextually equivalent to N if, and only if, \(\forall C[~], \sum _{h\in \mathrm {HNF}}\mathrm {Red}^{\infty }_{C[M],h}=\sum _{h\in \mathrm {HNF}}\mathrm {Red}^{\infty }_{C[N],h}\).
An important property in the following is extensionality, meaning invariance under \(\eta \)-equivalence. The \(\eta \)-equivalence is the smallest congruence such that, for any \(M\in \varLambda ^+\) and \(x\notin {{\,\mathrm{FV}\,}}(M)\), we have \(M =_\eta \lambda x.Mx\). Notice that contextual equivalence is extensional (see [1] for the classical \(\lambda \)-calculus).
3 Probabilistic Coherence Spaces
Girard introduced probabilistic coherence spaces (PCS) as a “quantitative refinement” of coherence spaces [9]. Danos and Ehrhard then considered the category \(\mathbf {Pcoh}\) of linear and Scott-continuous functions between PCS as a model of linear logic, and the cartesian closed category \(\mathbf {Pcoh}_!\) of entire functions between PCS as the Kleisli category associated with the comonad of \(\mathbf {Pcoh}\) modelling the exponential modality [5]. They also proved that \(\mathbf {Pcoh}_!\) provides an adequate model of probabilistic PCF, and defined the reflexive object \(\mathcal {D}\) which is our object of study.
The two categories \(\mathbf {Pcoh}\) and \(\mathbf {Pcoh}_!\) have since been studied in various papers. In particular, \(\mathbf {Pcoh}_!\) is proved to be fully abstract for call-by-name probabilistic PCF [7]. This result has also been extended to richer languages, e.g. call-by-push-value probabilistic PCF [8]. The untyped model \(\mathcal {D}\) is proven adequate for \(\varLambda ^+\) [6]. This paper is the continuation of the latter result, showing full abstraction for \(\mathcal {D}\) as a consequence of a stronger form of adequacy.
We briefly recall here the cartesian closed category \(\mathbf {Pcoh}_!\) and the reflexive object \(\mathcal {D}\). For space reasons we omit the linear logic model \(\mathbf {Pcoh}\), from which \(\mathbf {Pcoh}_!\) is derived. We refer the reader to [5, 6] for more details.
Probabilistic coherence spaces and entire functions. A probabilistic coherence space, or PCS for short, is a pair \(\mathcal {X}=(|\mathcal {X}|,\mathrm {P}\!\left( \mathcal {X}\right) )\) where \(|\mathcal {X}|\) is a countable set called the web of \({\mathcal {X}}\) and \(\mathrm {P}\!\left( \mathcal {X}\right) \) is a subset of the semimodule \((\mathbb {R}_{\ge 0})^{|\mathcal {X}|}\) such that the following three conditions hold: (i) closedness: \(\mathrm {P}\!\left( \mathcal {X}\right) ^{\perp \perp }=\mathrm {P}\!\left( \mathcal {X}\right) \), where, given a set \(P\subseteq (\mathbb {R}_{\ge 0})^{|\mathcal {X}|}\), the dual of P is defined as \(P^{\perp }=\{y\in (\mathbb {R}_{\ge 0})^{|\mathcal {X}|}\mid \forall x\in P,\ \sum _{a\in |\mathcal {X}|}x_ay_a\le 1\}\); (ii) boundedness: \(\forall a\in |\mathcal {X}|\), \(\exists \mu >0\), \(\forall x\in \mathrm {P}\!\left( \mathcal {X}\right) \), \(x_a\le \mu \); (iii) completeness: \(\forall a\in |\mathcal {X}|\), \(\exists x\in \mathrm {P}\!\left( {\mathcal {X}}\right) \), \(x _a>0\).
Given \(x,y\in \mathrm {P}\!\left( \mathcal {X}\right) \), we write \(x\le y\) for the order defined pointwise, i.e. for every \(a\in |\mathcal {X}|\), \(x_a\le y_a\). The closedness condition is equivalent to requiring that \(\mathrm {P}\!\left( \mathcal {X}\right) \) is convex and Scott-closed, as stated below.
Proposition 1
(e.g. [4]). Given an index set I and a subset \(P\subset (\mathbb {R}_{\ge 0})^I\) which is bounded and complete, we have \(P=P^{\perp \perp }\) iff the following two conditions hold: (i) P is convex, i.e. for every \(x,y\in P\) and \(\lambda \in [0,1]\), \(\lambda x + (1\lambda )y \in P\); (ii) P is Scottclosed, i.e. for every \(x\le y\in P\), \(x\in P\) and for every increasing chain \(\{x_i\}_{i\in \mathbb {N}}\subseteq P\), \(\sup _ix_i\in P\).
A datatype is denoted by a PCS \(\mathcal {X}\) and its data by vectors in \(\mathrm {P}\!\left( \mathcal {X}\right) \): convexity allows for probabilistic superposition and Scottclosedness for recursion.
Example 4
A simple example of PCS is \(\mathcal {U}\), with web \(|\mathcal {U}|\) a singleton set and \(\mathrm {P}\!\left( \mathcal {U}\right) =[0,1]\). Notice that \(\mathrm {P}\!\left( \mathcal {U}\right) ^{\perp }=\mathrm {P}\!\left( \mathcal {U}\right) \). This PCS gives the flat interpretation of the unit type in a typed language. The boolean type is denoted by the two-dimensional PCS \(\mathcal {B}\,{:}{:}{=}\,(\{\mathtt t, \mathtt f\}, \{(\rho _{\mathtt t},\rho _\mathtt f)\;\mid \,\rho _{\mathtt t}+\rho _{\mathtt f}\le 1\})\). Notice that \(\mathrm {P}\!\left( \mathcal {B}\right) \) can be seen as the set of probabilistic subdistributions over the boolean values.
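For \(\mathcal {B}\), the duality can be checked by hand in two dimensions: \(\mathrm {P}\!\left( \mathcal {B}\right) ^{\perp }=[0,1]^2\) and \(\mathrm {P}\!\left( \mathcal {B}\right) ^{\perp \perp }=\mathrm {P}\!\left( \mathcal {B}\right) \). The following Python sketch (ours, a finite grid approximation rather than an exact computation) illustrates this on sample points:

```python
import itertools

def grid(step=0.25):
    # sample points in [0, 2]^2, deliberately going beyond 1
    vals = [i * step for i in range(int(2 / step) + 1)]
    return list(itertools.product(vals, vals))

in_PB = lambda x: x[0] + x[1] <= 1 + 1e-9      # subdistributions over {t, f}
dot = lambda x, y: x[0] * y[0] + x[1] * y[1]

P = [x for x in grid() if in_PB(x)]
# y is in the dual iff <x, y> <= 1 for every x in P(B)
dualP = [y for y in grid() if all(dot(x, y) <= 1 + 1e-9 for x in P)]
assert all(max(y) <= 1 + 1e-9 for y in dualP)  # dual is (a grid of) [0,1]^2
bidual = [x for x in grid() if all(dot(x, y) <= 1 + 1e-9 for y in dualP)]
assert all(in_PB(x) for x in bidual)           # bidual gives back P(B)
```

The worst-case test vectors are the vertices: \(\langle x,y\rangle \le 1\) for all subdistributions x forces \(\max (y_{\mathtt t},y_{\mathtt f})\le 1\), and dually \(\langle x,(1,1)\rangle \le 1\) forces \(x_{\mathtt t}+x_{\mathtt f}\le 1\).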
As soon as one considers functional types, the intuitive notion of (discrete) subprobability distribution is lost. In particular, the reflexive object \(\mathcal {D}\) defined below is an example of an infinite dimensional PCS where arbitrarily large scalars may appear in \(\mathrm {P}\!\left( \mathcal {D}\right) \). One can think of PCS’s as a generalisation of the notion of discrete subprobability distribution allowing for a cartesian closed category.
4 Strong Adequacy
In this section we state and prove Theorem 1, enhancing the adequacy property given in [6]. The latter explains the computational meaning of the mass of \(\llbracket M\rrbracket \) restricted to \(\mathcal {D}_2\subseteq \mathcal {D}\), while our generalisation considers the whole of \(\llbracket M\rrbracket \), showing that it encodes the way the operational semantics dispatches the mass into the denotations of the head-normal forms. As in [6], the proof of Theorem 1 adapts a method introduced by Pitts [14], consisting in building a recursively specified relation of formal approximation \(\lhd \) (Proposition 3) which satisfies the same recursive equation as \(\mathcal {D}\). However, our generalisation requires a subtler definition of \(\lhd \) than the one in [6]. In particular, we must consider open terms in order to prove Lemma 7.
The approximation relation. Let us introduce some convenient notation, extending the definition of \(\lambda \)abstraction and application to general morphisms.
Definition 1
Given \(v\in \mathrm {P}\!\left( \mathcal {D}^{\varGamma ,x} \Rightarrow \mathcal {D}\right) \), let \(\varLambda (v)\in \mathrm {P}\!\left( \mathcal {D}^{\varGamma } \Rightarrow \mathcal {D}\right) \) be the vector representing the \(\lambda \)-abstraction of v with respect to x. Given \(v,u\in \mathrm {P}\!\left( \mathcal {D}^{\varGamma } \Rightarrow \mathcal {D}\right) \), let \({v}\mathop {@}{u}\in \mathrm {P}\!\left( \mathcal {D}^{\varGamma } \Rightarrow \mathcal {D}\right) \) be the vector representing the application of v to u. Finally, given a finite sequence \(u_1,\dots ,u_n\in \mathrm {P}\!\left( \mathcal {D}^{\varGamma } \Rightarrow \mathcal {D}\right) \), for \(n\in \mathbb {N}\), we denote by \({v}\mathop {@}{u_1\dots u_n}\) the vector \({({v}\mathop {@}{u_1})}\mathop {@}{ \dots u_n}\).
Lemma 1
The map \(v\mapsto \varLambda (v)\) is linear, i.e. for any vectors \(v, v'\) and scalars \(p,p'\in [0,1]\) such that \(p+p'\le 1\), we have \(\varLambda (p v+p' v')=p \varLambda (v)+p'\varLambda (v')\), and Scott-continuous, i.e. for any countable increasing chain \((v_n)_{n\in \mathbb {N}}\), \(\varLambda (\sup _n(v_n))=\sup _n(\varLambda (v_n))\). The map \((v,u_1,\dots ,u_n)\mapsto {v}\mathop {@}{u_1\dots u_n}\) is Scott-continuous in all of its arguments but linear only in its first argument v.
Proof
Scott-continuity follows because scalar multiplication and sum are Scott-continuous. Linearity holds because the matrices \(\mathtt {app}\) and \(\mathtt {\lambda }\) are associated with linear maps (namely, they have nonzero coefficients only on singleton multisets, see (12)), as is the leftmost component of \(\mathrm {Ev}\), see (9). \(\square \)
For any \(\varGamma \subseteq \varDelta \) there exists the projection \(\mathrm {pr}: \mathrm {P}\!\left( \mathcal {D}\right) ^\varDelta \rightarrow \mathrm {P}\!\left( \mathcal {D}\right) ^\varGamma \). Then, given a matrix \(v \in \mathrm {P}\!\left( \mathcal {D}^{\varGamma } \Rightarrow \mathcal {D}\right) \), we denote by \(v\!\!\uparrow ^{\varDelta } \in \mathrm {P}\!\left( \mathcal {D}^{\varDelta } \Rightarrow \mathcal {D}\right) \) the matrix corresponding to the precomposition of the morphism associated with v with \(\mathrm {pr}\). This can be explicitly defined by, for \(\varvec{m}\in \mathcal {M}_{\text {f}}\!\left( |\mathcal {D}|\right) ^{\varDelta }\) and \(d\in |\mathcal {D}|\): \(\left( v\!\!\uparrow ^{\varDelta }\right) _{\varvec{m},d}=v_{(\varvec{m}_x)_{x \in \varGamma },d}\) if \(\forall y \in \varDelta \setminus \varGamma , \varvec{m}_y = [ ]\), and \(\left( v\!\!\uparrow ^{\varDelta }\right) _{\varvec{m},d}=0\) otherwise.
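The explicit definition of the lifting can be sketched in Python (our own encoding, not the paper's: matrices as sparse dicts keyed by \((\varvec{m},d)\), environments as frozen sets of (variable, multiset) pairs, multisets as tuples; absent keys stand for 0 entries):

```python
def lift(v, Delta):
    """v|^Delta: extend each index environment of v with empty multisets
    on the fresh variables of Delta; all other entries stay 0 (absent)."""
    out = {}
    for (env, d), coef in v.items():
        full = {y: () for y in Delta}  # () encodes the empty multiset [ ]
        full.update(dict(env))         # keep the multisets on Gamma
        out[(frozenset(full.items()), d)] = coef
    return out

# a one-entry matrix over Gamma = {x}, lifted to Delta = {x, y}
v = {(frozenset({'x': ('*',)}.items()), '*'): 0.5}
w = lift(v, ['x', 'y'])
assert w[(frozenset({'x': ('*',), 'y': ()}.items()), '*')] == 0.5
```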
Given \((R^+,R^-) \in \mathcal {P}\left( \bigcup _{\varGamma } \left( \mathrm {P}\!\left( \mathcal {D}^{\varGamma } \Rightarrow \mathcal {D}\right) \times \varLambda ^+_{\varGamma }\right) \right) ^2\), we define \(\psi (R^+,R^-) = (\phi (R^-),\phi (R^+))\). Given two such pairs \((R^+_1,R^-_1), (R^+_2,R^-_2)\), we define \((R^+_1,R^-_1) \sqsubseteq (R^+_2,R^-_2)\) iff \(R_1^+ \subseteq R_2^+\) and \(R_1^- \supseteq R_2^-\).
Lemma 2
The relation \(\sqsubseteq \) is an order relation giving a complete lattice on \(\mathcal {P}\left( \bigcup _{\varGamma } \left( \mathrm {P}\!\left( \mathcal {D}^{\varGamma } \Rightarrow \mathcal {D}\right) \times \varLambda ^+_{\varGamma }\right) \right) ^2\).
Thanks to the previous lemma, we set \((\lhd ^+,\lhd ^-)\) as the glb of the set \(\{(R^+,R^-) \mid \psi (R^+,R^-) \sqsubseteq (R^+,R^-)\}\) of the pre-fixed points of \(\psi \).
Lemma 3
\(\psi (\lhd ^+,\lhd ^-)=(\lhd ^+,\lhd ^-)\), so \(\lhd ^+=\phi (\lhd ^-)\) and \(\lhd ^-=\phi (\lhd ^+)\).
Proof
One can check that \(\psi \) is monotone increasing with respect to \(\sqsubseteq \), so the result follows from Tarski’s fixed point theorem. \(\square \)
Lemma 4
For any \(R \subseteq \bigcup _{\varGamma } \left( \mathrm {P}\!\left( \mathcal {D}^{\varGamma } \Rightarrow \mathcal {D}\right) \times \varLambda ^+_{\varGamma }\right) \) and \(M \in \varLambda ^+_\varGamma \), the set \(\{v \in \mathrm {P}\!\left( \mathcal {D}^{\varGamma } \Rightarrow \mathcal {D}\right) \mid (v,M) \in \phi ^\varGamma (R)\}\) contains 0, and is downward closed and chain closed.
Proof
This is a consequence of the fact that the application \({v}\mathop {@}{u_1\dots u_n}\) and the lifting \(v\!\!\uparrow ^{\varDelta }\) are Scott-continuous (Lemma 1). Also, \(v\!\!\uparrow ^{\varDelta }\) is linear, as is \({v}\mathop {@}{u_1\dots u_n}\) in its left argument v (again by Lemma 1), so \({0\!\!\uparrow ^{\varDelta }}\mathop {@}{u_1 \dots u_n} = 0\). \(\square \)
Proposition 3
We have \(\lhd ^+ = \lhd ^-\). From now on we denote it simply by \(\lhd \). We write \(\lhd ^\varGamma \) for its component on \(\mathrm {P}\!\left( \mathcal {D}^{\varGamma } \Rightarrow \mathcal {D}\right) \times \varLambda ^+_\varGamma \).
Proof
Now if \((v,M) \in \lhd ^-\) then for all \(\ell \in \mathbb {N}\), \((v_{\ell },M) \in \lhd ^+\); but \(v = \sup _{\ell \in \mathbb {N}} v_{\ell }\), so Lemma 4 gives \((v,M) \in \lhd ^+\). \(\square \)
The key lemma. Lemma 9 is the so-called key lemma for the relation \(\lhd \). The reasoning is standard, except for the proof of Lemma 8, which is what allows for strong adequacy.
Lemma 5
For \(M \in \varLambda ^+_{x,\varGamma }, N \in \varLambda ^+_\varGamma \), \((v,(\lambda x.M)N)\! \in \! \lhd ^\varGamma \) iff \((v,M\{N/x\})\! \in \! \lhd ^\varGamma \).
Proof
Observe that for all \(n\in \mathbb {N}\), \(N_1,\dots ,N_n\in \varLambda ^+\) and \(h \in \mathrm {HNF}\) we have \(\mathrm {Red}^\infty _{(\lambda x.M)NN_1\dots N_n,h} = \mathrm {Red}^\infty _{M\{N/x\}N_1\dots N_n,h}\). \(\square \)
Lemma 6
Let (v, M) and (r, L) in \(\lhd ^\varGamma \), then \((pv + (1p)r,M +_p L) \in \lhd ^\varGamma \).
Proof
Simply observe that for all \(h \in \mathrm {HNF}\) and \(N_1,\dots ,N_n \in \varLambda ^+\) we have \(\mathrm {Red}^\infty _{(M +_p L)N_1\dots N_n,h} = p\,\mathrm {Red}^\infty _{MN_1\dots N_n,h} + (1-p)\mathrm {Red}^\infty _{LN_1\dots N_n,h}\). \(\square \)
Lemma 7
For all \(x \in \varGamma \), \((\mathrm {pr}^\varGamma _x,x) \in \lhd ^\varGamma \).
Proof
Lemma 8
Let \((v,M) \in \left( \mathrm {P}\!\left( \mathcal {D}^{\varGamma } \Rightarrow \mathcal {D}\right) \right) \times \varLambda ^+_\varGamma \), we have \((v,M) \in \lhd ^\varGamma \) iff for all \((r,L) \in \lhd ^\varDelta \) with \(\varDelta \supseteq \varGamma \), \(({v\!\!\uparrow ^{\varDelta }}\mathop {@}{r},M L) \in \lhd ^\varDelta \).
Proof
If \((v,M) \in \lhd ^\varGamma = \phi ^\varGamma (\lhd )\) and \((r,L) \in \lhd ^\varDelta \), then using the definition of \(\phi \) it is easy to check that \(({v\!\!\uparrow ^{\varDelta }}\mathop {@}{r},M L) \in \lhd ^\varDelta \). Conversely, if for all \((r,L) \in \lhd ^\varDelta \) we have \(({v\!\!\uparrow ^{\varDelta }}\mathop {@}{r},M L) \in \lhd ^\varDelta \) and we want to prove that \((v,M) \in \phi ^\varGamma (\lhd )\), then the conditions of Eq. (13) hold trivially whenever \(n \ge 1\), so we need to consider only the case \(n=0\).
Lemma 9
Proof
The proof is by induction on M. The abstraction uses Lemmas 5 and 8, the application uses Lemma 8 and the barycentric sum Lemma 6. \(\square \)
Theorem 1
Proof
The invariance of the interpretation under reduction (Proposition 2) gives that for all \(n \in \mathbb {N}\), \(\llbracket M\rrbracket ^\varGamma = \sum _{N \in \varLambda ^+_\varGamma } \mathrm {Red}^n_{M,N} \llbracket N\rrbracket ^\varGamma \ge \sum _{h \in \mathrm {HNF}_\varGamma } \mathrm {Red}^n_{M,h} \llbracket h\rrbracket ^\varGamma \). Letting \(n \rightarrow \infty \) we get \(\llbracket M\rrbracket ^\varGamma \ge \sum _{h \in \mathrm {HNF}_\varGamma } \mathrm {Red}^\infty _{M,h} \llbracket h\rrbracket ^\varGamma \).
Conversely, using Lemma 9 with \(\varDelta = \varGamma \) and \((u_i,N_i) = (\mathrm {pr}^\varGamma _{y_i},y_i)\), which is in \(\lhd ^\varGamma \) thanks to Lemma 7, we get \((\llbracket M\rrbracket ^\varGamma ,M) \in \lhd ^\varGamma \). The definition of \(\lhd = \phi (\lhd )\) with \(\varDelta = \varGamma \) and \(n = 0\) gives \(\llbracket M\rrbracket ^\varGamma \le \sum _{h \in \mathrm {HNF}_\varGamma } \mathrm {Red}^\infty _{M,h} \llbracket h\rrbracket ^\varGamma \). \(\square \)
5 Nakajima Trees and Full Abstraction
We apply our strong adequacy to infer full abstraction (Theorem 2). As mentioned in the Introduction, the bridge linking syntax and semantics is given by the notion of probabilistic Nakajima tree defined by Leventis [12] (here Definitions 2 and 3) in order to prove a separation theorem for \(\varLambda ^+\). Lemma 11 shows that the equality of Nakajima trees implies the denotational equality. The proof of this lemma uses the strong adequacy property.
Definition 2
The notation \(\bot \) represents the empty function (i.e. the distribution with empty support), encoding undefinedness and allowing directed sets of approximants.
Value Nakajima trees represent infinitary \(\eta \)-long head-normal forms: up to \(\eta \)-equivalence every head-normal form \(h = \lambda x_1 \dots x_n.y\,M_1\,\dots \,M_m\) is equal to \(\lambda x_1 \dots x_{n+k}.y\,M_1\,\dots \,M_m\,x_{n+1}\,\dots \,x_{n+k}\) for any \(k \in \mathbb {N}\) and \(x_{n+1},\dots ,x_{n+k}\) fresh, and value Nakajima trees are infinitary variants of such \(\eta \)-expansions.
Definition 3
Remark 1
In [12], following the definition of deterministic Nakajima trees in [1], the value tree \( VT ^\eta _{\ell +1}(\lambda x_1 \dots x_n.y\,M_1\,\dots \,M_m)\) includes explicitly the difference \(n-m\). This yields a heavier but somewhat more convenient definition, as then Lemma 10 also holds for \(\ell =1\). In this paper we chose the lighter definition. By Lemma 10, this choice does not affect Nakajima tree equality.
Example 5
Figure 2(a) depicts some examples of value Nakajima trees associated with the headnormal form \(\lambda x_1.y(\mathbf \varOmega x_1)x_1\). Notice that these trees are equivalent to the Nakajima trees associated with \(y(\mathbf \varOmega x_1)\) as well as \(y\mathbf \varOmega \). In fact, all these terms are contextually equivalent.
Figure 2(b) shows the Nakajima tree of depth 2 associated with the term \(y(u+_qv)+_p(y'+_{p'}\mathbf \varOmega )\). Notice that the two sums \(+_p\) and \(+_{p'}\) contribute to the same subprobability distribution, whereas they are kept distinct from the sum \(+_q\) on the argument side of an application.
Figure 2(c) gives some examples of the Nakajima trees associated with the term \(\mathbf \Theta (\lambda f.(y+_{p}y(f)))\), discussed also in Examples 2 and 3. Notice that the more the depth \(\ell \) increases, the more the top-level distribution’s support grows.
It is clear that the family \(( PT ^\eta _{\ell }(M))_{\ell \in \mathbb {N}}\) converges to a limit, but we do not need to make this limit explicit for our purposes, so we avoid defining the topology over probabilistic Nakajima trees yielding the convergence of this family.
The next lemma shows that the first levels of a \( VT ^\eta \) of a headnormal form h give a lot of information about the shape of h.
Lemma 10
Given two head-normal forms \(h=\lambda x_1\dots x_n.yM_1\dots M_m\) and \(h'=\lambda x_1\dots x_{n'}.y'M_1'\dots M_{m'}'\) and any \(\ell \ge 2\), if \( VT ^\eta _{\ell }(h)= VT ^\eta _{\ell }(h')\), then \(y=y'\) and \(n-m=n'-m'\).
Proof
The fact that \(y=y'\) follows immediately from the definition of \( VT ^\eta \). Concerning the second equality, one can assume \(n=n'\) by \(\eta \)-expanding one of the two terms; in fact \( VT ^\eta \) is invariant under \(\eta \)-expansion. Modulo \(\alpha \)-equivalence, we can then restrict ourselves to the case of \(h=\lambda x_1\dots x_n.yM_1\dots M_m\) and \(h'=\lambda x_1\dots x_n.yM_1'\dots M_{m'}'\).
Suppose, for the sake of contradiction, that \(m>m'\). Then we should have \( PT ^\eta _{\ell -1}(M_{m'+1})= PT ^\eta _{\ell -1}(x_{n+1})\), where \(x_{n+1}\) is a fresh variable; in particular \(x_{n+1}\notin {{\,\mathrm{FV}\,}}(M_{m'+1})\). Since \(\ell -1>0\), we have that \( PT ^\eta _{\ell -1}(x_{n+1})(t)=1\) only if t is equal to \(\lambda z_1z_2\dots . x_{n+1}\, PT ^\eta _{\ell -2}(z_1)\, PT ^\eta _{\ell -2}(z_2)\dots \), otherwise \( PT ^\eta _{\ell -1}(x_{n+1})(t)=0\). So, \( PT ^\eta _{\ell -1}(M_{m'+1})= PT ^\eta _{\ell -1}(x_{n+1})\) implies that \(\mathrm {Red}^{\infty }_{M_{m'+1},h}>0\) for some h having \(x_{n+1}\) as a free variable, which is impossible since \(x_{n+1}\notin {{\,\mathrm{FV}\,}}(M_{m'+1})\). \(\square \)

The size \(\#\) is defined inductively by:
– \(\#(\star ) = 0\) for the base element,
– \(\#(m\,{:}{:}\,d) = \#(m) + \#(d)\) for \(m\in \mathcal {M}_{\text {f}}\!\left( |\mathcal {D}|\right) \) and \(d\in |\mathcal {D}|\),
– \(\#([d_1,\dots ,d_n]) = n + \sum _{i=1}^n \#(d_i)\) for \(d_1,\dots ,d_n\in |\mathcal {D}|\),
– \(\#(\varvec{m},d) = \#(d) + \sum _{x \in \varGamma }(\#(\varvec{m}_x))\) for \(\varvec{m}\in \mathcal {M}_{\text {f}}\!\left( |\mathcal {D}|\right) ^\varGamma \) and \(d\in |\mathcal {D}|\).
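The first three clauses can be sketched in Python (our own encoding, not the paper's: the base element \(\star \) is the string `'*'`, an element \(m\,{:}{:}\,d\) is the pair `(m, d)` with `m` a tuple standing for a finite multiset \([d_1,\dots ,d_n]\)):

```python
def size(d):
    """#(d) for an encoded element d of the web |D|."""
    if d == '*':
        return 0          # #(*) = 0
    m, tail = d
    return msize(m) + size(tail)  # #(m :: d) = #(m) + #(d)

def msize(m):
    """#([d1,...,dn]) = n + sum_i #(di)."""
    return len(m) + sum(size(d) for d in m)

star = '*'
d1 = ((star,), star)       # [*] :: *
assert size(star) == 0
assert size(d1) == 1       # #([*]) + #(*) = (1 + 0) + 0
d2 = ((d1, star), d1)      # [d1, *] :: d1
assert size(d2) == (2 + 1 + 0) + 1
```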
Lemma 11
Given \(\ell \in \mathbb {N}\) and \(M, N \in \varLambda _\varGamma ^+\), if \( PT ^\eta _{\ell }(M)= PT ^\eta _{\ell }(N)\), then for any \((\varvec{m},d)\in \mathcal {M}_{\text {f}}\!\left( |\mathcal {D}|\right) ^\varGamma \times |\mathcal {D}|\) with \(\#(\varvec{m},d)<\ell \), we have \(\llbracket M\rrbracket ^\varGamma _{\varvec{m},d}=\llbracket N\rrbracket ^\varGamma _{\varvec{m},d}\).
Proof
We proceed by induction on \(\ell \). If \(\ell \le 1\), then \(\#(\varvec{m},d) = 0\) implies \(d=\star \) and, for every \(x\in \varGamma \), \(\varvec{m}_x=[\,]\). In this case we remark that both \(\llbracket M\rrbracket ^\varGamma _{\varvec{m},d}\) and \(\llbracket N\rrbracket ^\varGamma _{\varvec{m},d}\) are null. This can in fact be easily checked by inspecting the rules of Fig. 1, computing the matrix denoting a term by structural induction over the term.
Otherwise, by Theorem 1, we have: \(\llbracket M\rrbracket ^\varGamma _{\varvec{m},d} =\sum _{h \in \mathrm {HNF}_\varGamma } \mathrm {Red}^\infty _{M,h} \llbracket h\rrbracket ^\varGamma _{\varvec{m},d}\). This last sum can be refactored as \(\sum _{t\in VT ^\eta _{\ell }}\sum _{h\in ( VT ^\eta _{\ell })^{-1}(t)}\mathrm {Red}^\infty _{M,h} \llbracket h\rrbracket ^\varGamma _{\varvec{m},d}\). A similar reasoning for N gives \(\llbracket N\rrbracket ^\varGamma _{\varvec{m},d}=\sum _{t\in VT ^\eta _{\ell }}\sum _{h\in ( VT ^\eta _{\ell })^{-1}(t)}\mathrm {Red}^\infty _{N,h} \llbracket h\rrbracket ^\varGamma _{\varvec{m},d}\).

(\(\square \)) for any \(h,h' \in ( VT ^\eta _{\ell })^{-1}(t)\), we have \(\llbracket h\rrbracket ^\varGamma _{\varvec{m},d}=\llbracket h'\rrbracket ^\varGamma _{\varvec{m},d}\).
Notice that (\(\square \)) implies \(\llbracket M\rrbracket ^\varGamma _{\varvec{m},d}=\llbracket N\rrbracket ^\varGamma _{\varvec{m},d}\), since the hypothesis \( PT ^\eta _{\ell }(M)= PT ^\eta _{\ell }(N)\) gives \(\sum _{h\in ( VT ^\eta _{\ell })^{-1}(t)}\mathrm {Red}^\infty _{M,h}=\sum _{h\in ( VT ^\eta _{\ell })^{-1}(t)}\mathrm {Red}^\infty _{N,h}\), for any \(t\in VT ^\eta _{\ell }\).
Corollary 1
Let \(M, N \!\in \! \varLambda _\varGamma ^+\), \(\forall \ell \!\in \!\mathbb N, PT ^\eta _{\ell }(M)\!=\! PT ^\eta _{\ell }(N)\) implies \(\llbracket M\rrbracket ^\varGamma \!=\!\llbracket N\rrbracket ^\varGamma \).
Theorem 2
Given \(M, N \in \varLambda _\varGamma ^+\), the following statements are equivalent:
 1.
M and N are contextually equivalent;
 2.
M and N have the same Nakajima trees;
 3.
M and N have the same interpretation in \(\mathcal {D}\).
Footnotes
 1.
In fact, this isomorphism corresponds, for I finite, to the fundamental exponential isomorphism \(!(\mathcal {X}\,\& \,\mathcal {Y})\cong \,!\mathcal {X}\otimes \,!\mathcal {Y}\) of linear logic.
 2.
The elements of \(|\mathcal {D}|\) can be seen as intersection types generated from the constant \(\star \), the \(\mathop {{}{:}{:}{}}\) operation being the arrow and multisets non-idempotent intersections.
References
 1.Barendregt, H.: The Lambda Calculus: Its Syntax and Semantics. Studies in Logic and the Foundations of Mathematics, vol. 103. North-Holland, Amsterdam (1984)
 2.Clairambault, P., Paquet, H.: Fully abstract models of the probabilistic lambda-calculus. In: Ghica, D.R., Jung, A. (eds.) 27th EACSL Annual Conference on Computer Science Logic, CSL 2018, 4–7 September 2018, Birmingham, UK. LIPIcs, vol. 119, pp. 16:1–16:17. Schloss Dagstuhl - Leibniz-Zentrum fuer Informatik (2018). https://doi.org/10.4230/LIPIcs.CSL.2018.16
 3.Crubillé, R.: Probabilistic stable functions on discrete cones are power series. In: Dawar, A., Grädel, E. (eds.) Proceedings of the 33rd Annual ACM/IEEE Symposium on Logic in Computer Science, LICS 2018, Oxford, UK, 09–12 July 2018, pp. 275–284. ACM (2018). https://doi.org/10.1145/3209108.3209198
 4.Crubillé, R., Ehrhard, T., Pagani, M., Tasson, C.: The free exponential modality of probabilistic coherence spaces. In: Esparza, J., Murawski, A.S. (eds.) FoSSaCS 2017. LNCS, vol. 10203, pp. 20–35. Springer, Heidelberg (2017). https://doi.org/10.1007/978-3-662-54458-7_2
 5.Danos, V., Ehrhard, T.: Probabilistic coherence spaces as a model of higher-order probabilistic computation. Inf. Comput. 209(6), 966–991 (2011)
 6.Ehrhard, T., Pagani, M., Tasson, C.: The computational meaning of probabilistic coherence spaces. In: Grohe, M. (ed.) Proceedings of the 26th Annual IEEE Symposium on Logic in Computer Science (LICS 2011), pp. 87–96. IEEE Computer Society Press (2011)
 7.Ehrhard, T., Pagani, M., Tasson, C.: Probabilistic coherence spaces are fully abstract for probabilistic PCF. In: Sewell, P. (ed.) The 41st Annual ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages, POPL 2014, San Diego, USA. ACM (2014)
 8.Ehrhard, T., Tasson, C.: Probabilistic call by push value (2016). http://arxiv.org/abs/1607.04690
 9.Girard, J.-Y.: Between logic and quantic: a tract. In: Ehrhard, T., Girard, J.-Y., Ruet, P., Scott, P. (eds.) Linear Logic in Computer Science. London Mathematical Society Lecture Note Series, vol. 316. CUP, Cambridge (2004)
 10.Hyland, M.: A syntactic characterization of the equality in some models for the lambda calculus. J. London Math. Soc. 12, 361–370 (1976)
 11.Laird, J., Manzonetto, G., McCusker, G., Pagani, M.: Weighted relational models of typed lambda-calculi. In: 28th Annual ACM/IEEE Symposium on Logic in Computer Science, LICS 2013, New Orleans, LA, USA, 25–28 June 2013. IEEE Computer Society (2013)
 12.Leventis, T.: Probabilistic Böhm trees and probabilistic separation. In: Proceedings of the 33rd Annual ACM/IEEE Symposium on Logic in Computer Science, LICS 2018, Oxford, UK, 09–12 July 2018, pp. 649–658 (2018). https://doi.org/10.1145/3209108.3209126
 13.Manzonetto, G.: A general class of models of \(\cal{H}^*\). In: Královič, R., Niwiński, D. (eds.) MFCS 2009. LNCS, vol. 5734, pp. 574–586. Springer, Heidelberg (2009). https://doi.org/10.1007/978-3-642-03816-7_49
 14.Pitts, A.M.: Computational adequacy via ‘mixed’ inductive definitions. In: Brookes, S., Main, M., Melton, A., Mislove, M., Schmidt, D. (eds.) MFPS 1993. LNCS, vol. 802, pp. 72–82. Springer, Heidelberg (1994). https://doi.org/10.1007/3-540-58027-1_3
 15.Wadsworth, C.P.: The relation between computational and denotational properties for Scott’s \(D_\infty \)-models of the lambda-calculus. SIAM J. Comput. 5, 488–521 (1976)
Copyright information
Open Access This chapter is licensed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license and indicate if changes were made.
The images or other third party material in this chapter are included in the chapter's Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the chapter's Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.