1 Introduction

The theory of coalgebras encompasses a wide variety of probabilistic systems, together with corresponding notions of bisimulation and behavioural equivalence [18]. We focus on one of the most basic instances: (generative) probabilistic transition systems (PTS), consisting of a set of states X and, for every state, a probability distribution over next states and (explicit) termination. Formally, they are coalgebras of the form \(\alpha :X \rightarrow \mathcal {D}(A \times X + 1)\), where A is a fixed set of transition labels, \(\mathcal {D}\) is the probability distribution functor and \(1=\{*\}\) is a singleton, whose element we interpret as an extra ‘accepting/termination’ state (see Sect. 3 for details).

There is a natural notion of finite trace semantics for such PTSs, assigning to every state a sub-probability distribution of words, as a quantitative analogue of acceptance of words in non-deterministic automata. The definition of infinite traces is more subtle: it requires assigning probabilities to sets of traces rather than to individual traces (infinite traces often have probability zero), and hence moving to probability measures. It is shown in [13] how finite and infinite trace semantics arise by modelling PTSs as coalgebras in the Kleisli category of the Giry monad.

As such, the (in)finite traces semantics of PTSs is an instance of the general theory of trace semantics through Kleisli categories, as proposed in [10]. A fundamentally different way of obtaining trace semantics of coalgebras is through determinisation constructions, generalising the classical powerset construction of non-deterministic automata [12, 16] but also encompassing many other examples. In particular, in [12, 17] it is described how the finite traces of probabilistic transition systems arise through a certain determinisation construction, turning a PTS into a Moore automaton with sub-probability distributions as states. One of the advantages of determinisation is that it allows the use of bisimulations (up-to) to prove trace equivalence. In particular, bisimulations up to congruence were used in Bonchi and Pous’ HKC algorithm for non-deterministic automata [4] and in its extension to weighted automata [2].

In this paper, we show that the finite and infinite trace semantics of PTS, as in [13], arises through a determinisation construction (Sect. 4). The essential underlying idea that enables this approach is that the (in)finite traces semantics in [13] is essentially generated from two kinds of finite trace semantics: those that take into account termination/acceptance (as mentioned above), and those that do not (simply the probability of exhibiting a path in the PTS). In particular, for finite PTS, our determinisation construction yields an effective procedure for proving (in)finite trace equivalence using bisimulation up to congruence, using a variation of the HKC algorithm (Sect. 5). We finally show that the determinisation construction generalises to the setting of continuous PTS, working in the category of measurable spaces and with \(\mathcal {D}\) replaced by the Giry monad (Sect. 6). While this generalises the discrete case, it is presented separately to make the discrete case accessible to a wider audience: the latter requires very little measure theory. We conclude with a discussion of related work (Sect. 7).

2 Preliminaries

Any finite set A can be called an alphabet and its elements letters. The set of words of length n with letters in A is denoted by \(A^n\). By convention \(A^0 =\{ \varepsilon \}\) where \(\varepsilon \) is the empty word. The set of finite words over A is denoted by \(A^* = \bigcup _{n\in \mathbb {N}} A^n\), the set of infinite words by \(A^\omega = A^\mathbb {N}\) and the set of all (finite and infinite) words by \(A^\infty = A^* \cup A^\omega \). A language L is a subset of \(A^*\), i.e., an element of \(\mathcal {P}(A^*)\). It can be seen as a function \(L :A^* \rightarrow \{0,1\}\), by setting \(L(w)=1\) iff \(w\in L\). The language derivative of L with respect to a letter a is defined by \(L_a(w)=L(aw)\). The length of \(w \in A^\infty \) is denoted by \(|w| \in \mathbb {N}\cup \{ \infty \}\). The concatenation function \(c : A^* \times A^\infty \rightarrow A^\infty \) is denoted by juxtaposition (\(c(u,v) = uv\)) and defined by \(uv(n) = u(n)\) if \(n < |u|\) and \(uv(n) = v(n-|u|)\) if \(|u|\le n < |u| + |v|\). It can be extended to languages \(\mathcal {P}(A^*) \times \mathcal {P}(A^\infty ) \rightarrow \mathcal {P}(A^\infty )\) by setting \(LM = \{ uv \mid u \in L, v \in M \}\). We sometimes abbreviate \(\{w\} M\) by wM.

Coalgebras and Moore Automata. We recall the basic definition of coalgebras, see, e.g., [11, 15] for details and examples. The only instances that we use in this paper are Moore automata (recalled below), probabilistic transition systems (Sect. 3) and measure-theoretic generalisations of both (Sect. 6). Let \(\mathcal {C}\) be a category, and \(F :\mathcal {C}\rightarrow \mathcal {C}\) a functor. An F-coalgebra consists of an object X and an arrow \(\alpha :X \rightarrow FX\). Given coalgebras \((X,\alpha )\) and \((Y,\beta )\), a coalgebra homomorphism is an arrow \(f :X \rightarrow Y\) such that \(\beta \circ f = Ff \circ \alpha \). Coalgebras and homomorphisms form a category \(\mathrm {CoAlg}(F)\). A final object in \(\mathrm {CoAlg}(F)\) is called a final coalgebra; explicitly, a coalgebra \(({\varOmega },\omega )\) is final if for every F-coalgebra \((X,\alpha )\) there is a unique coalgebra homomorphism \(\varphi :X \rightarrow {\varOmega }\). We recall the notion of bisimulation only for Moore automata, below.

Let B be a set. Define the machine functor \(F_B :\mathsf {Set}\rightarrow \mathsf {Set}\) by \(F_B X = B \times X^A\) and \(F_B f = id_B \times f^A\). An \(F_B\)-coalgebra \(\langle o, t \rangle :X \rightarrow B \times X^A\) is called a Moore automaton (with output in B). A relation \(R \subseteq X \times X\) on the states of a Moore automaton \(\langle o, t \rangle :X \rightarrow B \times X^A\) is a bisimulation if for all \((x,y) \in R\): \(o(x) = o(y)\) and for all \(a\in A\), \((t_a(x),t_a(y))\in R\) (here, we use the classical notation \(t_a(x)\) instead of writing t(x)(a)). We write \(x \sim y\) if there exists a bisimulation R such that xRy, and in this case say that x and y are bisimilar. For every B, there exists a final \(F_B\)-coalgebra \(({\varOmega },\omega )\) where \({\varOmega }= B^{A^*}\). For an \(F_B\)-coalgebra \((X,\alpha )\), we write \(\varphi _\alpha :X \rightarrow B^{A^*}\) or simply \(\varphi \) for the unique coalgebra morphism. We think of the elements of \(B^{A^*}\) as (weighted) languages, and of \(\varphi (x)\) as the language of a state x. In particular, for \(B=2\), Moore automata are classical deterministic automata, and \(\varphi \) gives the usual language semantics. We have \(\varphi (x) = \varphi (y)\) iff \(x \sim y\), i.e., language equivalence coincides with bisimilarity.
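For a finite Moore automaton, the bisimulation conditions above can be checked mechanically. The following Python sketch builds a candidate bisimulation on the fly, starting from the pair of interest; the dictionary encoding (`out`, `step`) and the three-state example automaton are assumptions made for illustration only.

```python
from collections import deque

def bisimilar(out, step, alphabet, x, y):
    """Return True iff x ~ y in a finite Moore automaton.

    out[s] is the output o(s) in B; step[s][a] is the successor t_a(s).
    """
    relation = set()            # candidate bisimulation, built on the fly
    todo = deque([(x, y)])
    while todo:
        p, q = todo.popleft()
        if (p, q) in relation:
            continue
        if out[p] != out[q]:    # related states must have equal outputs
            return False
        relation.add((p, q))
        for a in alphabet:      # a-successors must be related again
            todo.append((step[p][a], step[q][a]))
    return True

# Hypothetical automaton with B = {0, 1} and A = {"a"}.
out = {"p": 0, "q": 0, "r": 1}
step = {"p": {"a": "r"}, "q": {"a": "r"}, "r": {"a": "r"}}

print(bisimilar(out, step, ["a"], "p", "q"))
print(bisimilar(out, step, ["a"], "p", "r"))
```

When the function returns True, the set `relation` is a bisimulation containing (x, y); when it returns False, a pair reachable from (x, y) has different outputs, so no bisimulation can relate x and y.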

Measure Theory. Let X be a set. A \(\sigma \)-algebra on X is a subset \({\varSigma }_X \subseteq \mathcal {P}(X)\) such that \(\emptyset \in {\varSigma }_X\) and \({\varSigma }_X\) is closed under complementation and countable union. Note that this implies that \(X \in {\varSigma }_X\) and that \({\varSigma }_X\) is closed under countable intersection and set difference. Given any subset \(G \subseteq \mathcal {P}(X)\), there always exists a smallest \(\sigma \)-algebra containing G. Indeed, \(\mathcal {P}(X)\) is a \(\sigma \)-algebra containing G, and the intersection of an arbitrary non-empty set of \(\sigma \)-algebras is itself a \(\sigma \)-algebra: just take the intersection of all \(\sigma \)-algebras containing G. We call it the \(\sigma \)-algebra generated by G and denote it by \(\sigma _X(G)\). For example, \(\mathcal {P}(X)\) is a \(\sigma \)-algebra on X. When working with real numbers \(\mathbb {R}\), we will use the Borel \(\sigma \)-algebra \(\mathcal {B}(\mathbb {R}) = \sigma _\mathbb {R} (\{ (-\infty ,x] \mid x \in \mathbb {R} \})\). We use \(\mathcal {B}([0,1]) = \{ B \cap [0,1]\mid B \in \mathcal {B}(\mathbb {R}) \}\) as the canonical \(\sigma \)-algebra on \([0,1]\). If X is a set and \({\varSigma }_X\) is a \(\sigma \)-algebra on X, the pair \((X,{\varSigma }_X)\) is called a measurable space. We write X for \((X,{\varSigma }_X)\) when the \(\sigma \)-algebra used is clear. A function \(f :(X,{\varSigma }_X) \rightarrow (Y,{\varSigma }_Y)\) is measurable if for all \(S_Y \in {\varSigma }_Y\), \(f^{-1}(S_Y) \in {\varSigma }_X\). The composition of measurable functions is measurable. An (implicitly finite) measure is a map \(m :{\varSigma }_X \rightarrow \mathbb {R}_+\) such that \(m(\emptyset ) = 0\) and \(m\left( \bigcup _{n\in \mathbb {N}} A_n\right) = \sum _{n\in \mathbb {N}} m(A_n)\) if the union is disjoint (\(\sigma \)-additivity property). We write \(\mathcal {M}(X)\) for the set of measures on a measurable space \((X,{\varSigma }_X)\).

(Sub)distribution. The distribution functor \(\mathcal {D}:\mathsf {Set}\rightarrow \mathsf {Set}\) is defined by \( \mathcal {D}(X) = \{p :X \rightarrow [0,1] \mid \sum _{x \in X} p(x) = 1\} \) and, given a function \(f :X \rightarrow Y\), \(\mathcal {D}f(u)(y) = \sum _{x\in f^{-1}(\{y\})} u(x)\). The functor \(\mathcal {D}\) extends to a monad, with the unit \(\eta \) given by the Kronecker delta \(\eta _X(x) = \delta _x\) (i.e., \(\eta (x)(y) =1\) if \(x=y\) and \(\eta (x)(y) = 0\) otherwise), and the multiplication by \(\mu _X(U)(y) = \sum _{u\in \mathcal {D}X} U(u) \cdot u(y)\). The sub-distribution functor \(\mathcal {S}:\mathsf {Set}\rightarrow \mathsf {Set}\) is defined by \(\mathcal {S}(X) = \{p :X \rightarrow [0,1] \mid \sum _{x \in X} p(x) \le 1\}\). It extends to a monad in a similar way. There is a natural embedding of \(\mathcal {D}\) in \(\mathcal {S}\), which we denote by \(\iota :\mathcal {D}\Rightarrow \mathcal {S}\).
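For finitely supported distributions, the functor action, unit and multiplication can be written out directly. The sketch below represents a distribution as a Python dictionary from elements to exact fractions; this representation, and the function names `dist_map`, `unit` and `mult`, are assumptions made for illustration.

```python
from collections import defaultdict
from fractions import Fraction

def dist_map(f, u):
    """D f (u): push the distribution u (a dict x -> probability) along f."""
    v = defaultdict(Fraction)
    for x, p in u.items():
        v[f(x)] += p            # sums u(x) over the preimage of each y
    return dict(v)

def unit(x):
    """eta_X(x): the Dirac distribution delta_x."""
    return {x: Fraction(1)}

def mult(U):
    """mu_X(U): flatten a distribution over distributions.

    U is given as a list of (weight, distribution) pairs, because
    dictionaries cannot be used as dictionary keys.
    """
    v = defaultdict(Fraction)
    for weight, u in U:
        for y, p in u.items():
            v[y] += weight * p
    return dict(v)

u = {1: Fraction(1, 2), 2: Fraction(1, 4), 3: Fraction(1, 4)}
print(dist_map(lambda n: n % 2, u))
```

The same functions work unchanged for sub-distributions, since nothing in them relies on the weights summing to 1.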

3 Trace Semantics of Probabilistic Transition Systems

In this section, we recall PTSs and their (finite and infinite) trace semantics, following [13]. We start with the finite trace semantics.

Definition 3.1

A probabilistic transition system (PTS) is a coalgebra for the functor \(\mathcal {D}(A \times \mathrm {Id}+ 1)\), i.e., a set X together with a map \(\alpha :X \rightarrow \mathcal {D}(A \times X + 1)\).

Definition 3.2

Let \(\alpha :X \rightarrow \mathcal {D}(A \times X + 1)\) be a PTS. The finite trace semantics \(\llbracket - \rrbracket _f :X \rightarrow \mathcal {S}(A^*)\) is defined by the following equations.

$$\begin{aligned} \llbracket x \rrbracket _f (\varepsilon ) = \alpha (x)(*) \qquad \llbracket x \rrbracket _f (aw) = \sum _{y\in X} \alpha (x)(a,y) \cdot \llbracket y \rrbracket _f(w) \end{aligned}$$

for all \(x \in X\), \(a \in A\), \(w \in A^*\).
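The two equations translate directly into a recursive function on finite words. The Python sketch below is illustrative only: the encoding of a PTS as a dictionary, the marker `STOP` for the element \(*\), and the single-state example (a state that terminates or loops on a, each with probability 1/2) are assumptions, not taken from the figures of the paper.

```python
from fractions import Fraction

STOP = "*"  # stands for the element of 1 = {*}

def finite_trace(alpha, x, w):
    """[[x]]_f(w) for a PTS alpha: state -> {STOP or (letter, state): probability}."""
    if w == "":
        # first equation: the probability of terminating immediately
        return alpha[x].get(STOP, Fraction(0))
    a, rest = w[0], w[1:]
    total = Fraction(0)
    for key, p in alpha[x].items():
        # second equation: sum over all a-successors y of x
        if key != STOP and key[0] == a:
            total += p * finite_trace(alpha, key[1], rest)
    return total

# Assumed example: x terminates with probability 1/2 and loops on a with probability 1/2.
alpha = {"x": {STOP: Fraction(1, 2), ("a", "x"): Fraction(1, 2)}}

for n in range(4):
    print(n, finite_trace(alpha, "x", "a" * n))
```

For this assumed PTS the function returns \(1/2^{n+1}\) on the word \(a^n\).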

Consider as a first example the simple PTS below, where the element \(*\) is represented as a distinguished double-circled state, and a transition is represented by an arrow labeled with its probability.


We have \(\llbracket x \rrbracket _f(a^n) = \frac{1}{2^{n+1}}\) for all n. The trace semantics becomes more subtle if infinite words are also taken into account. Consider, for instance, the following PTS.


Intuitively both states accept any finite or infinite word w over \(\{a,b\}\) with probability 0. However, the probability of ‘starting with an a’ in y or z is clearly different. This becomes apparent when we move to assigning probability to sets of traces, which is where we need a bit of measure theory. We therefore first define a suitable \(\sigma \)-algebra on the set \(A^\infty \) of finite and infinite words.

Definition 3.3

Let \(S_\infty = \{\emptyset \} \cup \{ \{w\} \mid w \in A^* \} \cup \{wA^\infty \mid w \in A^* \}\). The \(\sigma \)-algebra of measurable sets of words is defined to be \({\varSigma }_{A^\infty } = \sigma _{A^\infty }(S_\infty )\).

This \(\sigma \)-algebra is generated by a countable family of generators: the empty set, the singletons of finite words, and the cones, i.e., sets \(wA^\infty \) of words that have the finite word w as a prefix. This \(\sigma \)-algebra is very natural. Indeed, the usual measure-theoretical \(\sigma \)-algebra on \(A^* \cup A^\omega \) would be the combination of the discrete \(\sigma \)-algebra \(\mathcal {P}(A^*)\) and the product \(\sigma \)-algebra (see, e.g., [1], Definition 4.42) of all \(\mathcal {P}(A)\) on \(A^\omega \). One can easily prove that this construction yields our \({\varSigma }_{A^\infty }\) too. In the sequel, this is the \(\sigma \)-algebra on \(A^\infty \) implicitly used. The following proposition establishes measurability for some useful sets.

Proposition 3.4

The following sets of words are measurable:

  (i) the singleton \(\{w\}\) for any \(w \in A^\infty \);

  (ii) any countable language;

  (iii) any language of finite words;

  (iv) \(\emptyset \), \(A^*\), \(A^\omega \), \(A^\infty \);

  (v) the concatenation LS where \(L \subseteq A^*\) and \(S \in {\varSigma }_{A^\infty }\).

In the following, if m is a measure over \(A^\infty \) and \(w \in A^\infty \), we will write m(w) instead of \(m(\{w\})\). We have the following key theorem, which follows easily from results in [13]:

Theorem 3.5

Let \(m :S_\infty \rightarrow \mathbb {R}_+\) be a map satisfying \(m(\emptyset )=0\). The two following conditions are equivalent.

  (i) There exists a unique measure \(\tilde{m} :{\varSigma }_{A^\infty } \rightarrow \mathbb {R}_+\) such that \(\tilde{m}_{|S_\infty } = m\).

  (ii) For all \(w \in A^*\), \(m(wA^\infty ) = m(w) + \sum _{a\in A} m(waA^\infty )\).


Proof. \((i)\Rightarrow (ii)\) The equation follows directly from the \(\sigma \)-additivity of \(\tilde{m}\), since \(wA^\infty \) is the disjoint union of \(\{w\}\) and the cones \(waA^\infty \) for \(a \in A\). \((ii)\Rightarrow (i)\) According to Lemma 3.18 of [13], (ii) is equivalent to the fact that m is a pre-measure. Using Carathéodory’s extension theorem (e.g., [14]), this pre-measure can be uniquely extended to a measure as in (i). \(\square \)

Recall that \(\mathcal {M}(A^\infty )\) denotes the set of measures m on \(A^\infty \).

Definition 3.6

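Let \(\alpha :X \rightarrow \mathcal {D}(A \times X + 1)\) be a PTS. The (finite and infinite) trace semantics \(\llbracket - \rrbracket :X \rightarrow \mathcal {M}(A^\infty )\) is defined by the following equations.

$$\begin{aligned} \llbracket x \rrbracket (\varepsilon ) = \alpha (x)(*) \qquad&\llbracket x \rrbracket (aw) = \sum _{y\in X} \alpha (x)(a,y) \cdot \llbracket y \rrbracket (w) \\ \llbracket x \rrbracket (\varepsilon A^\infty ) = 1 \qquad&\llbracket x \rrbracket (awA^\infty ) = \sum _{y\in X} \alpha (x)(a,y) \cdot \llbracket y \rrbracket (wA^\infty ) \end{aligned}$$

for all \(x \in X\), \(a \in A\), \(w \in A^*\). (These equations uniquely determine a measure by Theorem 3.5.)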

Example 3.7

Consider the following PTS over the alphabet \(A = \{a\}\).

figure a

The semantics \(\llbracket x \rrbracket \) is easy to compute for sets of words in \(S_\infty \) by induction: for every finite word w, \(\llbracket x \rrbracket (w) = 0\) and \(\llbracket x \rrbracket (wA^\infty ) = 1\). Because \((a^n A^\infty )_{n}\) is a non-increasing sequence of measurable sets converging to \(\{ a^\omega \}\), continuity of measures yields \(\llbracket x \rrbracket (a^\omega ) = \lim _{n\rightarrow +\infty } \llbracket x \rrbracket (a^n A^\infty ) = 1\). Let us look at \(\llbracket y \rrbracket \). Intuitively, the probability of performing n loops in state y and then moving to (and staying in) state x is \(1/3^{n+1}\). Summing these for \(n\in \mathbb {N}\cup \{0\}\) gives 1/2, the probability of eventually moving to state x. Indeed, first observe that \(\llbracket y \rrbracket (\varepsilon ) = 1/3\) and \(\llbracket y \rrbracket (\varepsilon A^\infty ) = 1\). Let \(n \in \mathbb {N}\cup \{0\}\), then:

$$\begin{aligned}&\llbracket y \rrbracket (a^{n+1}) = \frac{1}{3} \llbracket y \rrbracket (a^n) + \frac{1}{3} \llbracket x \rrbracket (a^n) = \frac{1}{3} \llbracket y \rrbracket (a^n) \\&\llbracket y \rrbracket (a^{n+1}A^\infty ) = \frac{1}{3} \llbracket y \rrbracket (a^n A^\infty ) + \frac{1}{3} \llbracket x \rrbracket (a^n A^\infty ) = \frac{1}{3} \llbracket y \rrbracket (a^n A^\infty ) + \frac{1}{3} \end{aligned}$$

One can then prove that \(\llbracket y \rrbracket (a^n) = 1/3^{n+1}\) and \(\llbracket y \rrbracket (a^\omega ) = \lim _{n\rightarrow +\infty } \llbracket y \rrbracket (a^n A^\infty ) = \lim _{n\rightarrow +\infty } (1 + 3^{-n})/2 = 1/2\).
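The closed forms above can be checked numerically for small n. The Python sketch below encodes the PTS of this example as read off from the computation (x loops on a with probability 1; y terminates, loops on a, or moves to x on a, each with probability 1/3). This encoding, and the recursive reading of the semantics on words and cones that mirrors the two displayed recurrences, are assumptions made for illustration.

```python
from fractions import Fraction

STOP = "*"  # stands for the element of 1 = {*}

# Assumed encoding of the PTS of this example.
alpha = {
    "x": {("a", "x"): Fraction(1)},
    "y": {STOP: Fraction(1, 3), ("a", "y"): Fraction(1, 3), ("a", "x"): Fraction(1, 3)},
}

def word(x, w):
    """[[x]](w): probability of performing the finite word w and then terminating."""
    if not w:
        return alpha[x].get(STOP, Fraction(0))
    return sum((p * word(k[1], w[1:]) for k, p in alpha[x].items()
                if k != STOP and k[0] == w[0]), Fraction(0))

def cone(x, w):
    """[[x]](w A^infty): probability of exhibiting the path w, ignoring termination."""
    if not w:
        return Fraction(1)
    return sum((p * cone(k[1], w[1:]) for k, p in alpha[x].items()
                if k != STOP and k[0] == w[0]), Fraction(0))

for n in range(4):
    print(n, word("y", "a" * n), cone("y", "a" * n))
```

The values of `cone("y", "a" * n)` decrease towards 1/2, in line with the limit computed above.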

Example 3.8

Consider again the PTS in (2). We have \(\llbracket y \rrbracket (w) = \llbracket z \rrbracket (w) = 0\) for all \(w \in A^*\). However, \(\llbracket y \rrbracket (aA^\infty ) = \llbracket y \rrbracket (bA^\infty ) = 1/2\) whereas \(\llbracket z \rrbracket (aA^\infty ) = 3/4\) and \(\llbracket z \rrbracket (bA^\infty )=1/4\). Hence \(\llbracket y \rrbracket \ne \llbracket z \rrbracket \), as expected.

The above (in)finite trace semantics is essentially generated from two kinds of finite trace semantics: one for finite words w, and one for cones \(wA^\infty \), where \(w \in A^*\). The probability of the latter is simply the probability of the finite trace w without considering acceptance/termination, i.e., the probability of exhibiting the path w. This finite presentation is exploited in the determinisation construction in the next section, which essentially encodes both kinds of finite trace semantics simultaneously.

4 Determinisation

In this section, we show how the finite and infinite trace semantics of PTS (Definition 3.6) arises through a determinisation construction. This construction transforms any PTS into a certain kind of Moore machine with sub-probability distributions as states. The final coalgebra semantics of this Moore machine represents the trace semantics \(\llbracket - \rrbracket :X \rightarrow \mathcal {M}(A^\infty )\) of the original PTS. The determinisation procedure is exploited in the next section to give an algorithm for computing (in)finite trace equivalence, based on bisimulations.

In Sect.  6, we consider a more general kind of PTS, with measurable sets as state spaces, which fully generalises the results and constructions of the current section. Most proofs in the current section are hence omitted. Moreover, it is explained in Sect.  6 that our approach is an instance of the abstract framework of coalgebraic determinisation based on distributive laws [12, 16]. In the current section we mostly neglect this and present the concrete constructions.

Throughout this section, let \(\alpha :X \rightarrow \mathcal {D}(A \times X + 1)\) be a PTS. Our approach to (in)finite traces resembles the determinisation construction of [12, 17] for finite traces of PTSs. As explained below, there is one crucial addition for (in)finite traces: we make the total weight of sub-distributions in the determinised coalgebra observable, essentially to capture the probability of the ‘cones’ \(wA^\infty \). We will show that the resulting final coalgebra semantics factorises through the set \(\mathcal {M}(A^\infty )\) of measures on words, recovering the trace semantics of Definition 3.6. The overall construction is as follows.

  (i) Translate \(\alpha \) into a coalgebra \(\tilde{\alpha } :X \rightarrow [0,1]\times [0,1]\times (\mathcal {S}X)^A\).

  (ii) Determinise it: define an \(\alpha ^\sharp :\mathcal {S}X \rightarrow [0,1]\times [0,1]\times (\mathcal {S}X)^A\) such that \(\alpha ^\sharp \circ \eta _X = \tilde{\alpha }\). Let \(\varphi :\mathcal {S}X \rightarrow ([0,1]\times [0,1])^{A^*}\) be the unique map to the final coalgebra.

  (iii) Factorise \(\varphi \) to get a coalgebra morphism \(\mathcal {S}X \rightarrow \mathcal {M}(A^\infty )\), then precompose with \(\eta _X\) to get the desired trace semantics \(X \rightarrow \mathcal {M}(A^\infty )\).

The construction is summed up in the following diagram. Below, we explain each of the steps in detail.

figure b

Remark 4.1

As mentioned above, the construction is quite close to the determinisation of finite traces [12, 17]. There are two main differences: first, the latter determinises to a Moore automaton of the type \(\mathcal {S}X \rightarrow [0,1]\times (\mathcal {S}X)^A\) (so with \([0,1]\) rather than \([0,1]\times [0,1]\)). Second, the decomposition of \(\varphi \) here yields measures (to represent (in)finite trace semantics) rather than sub-probability distributions (to represent finite trace semantics).

  • (i) Translation: from \(\alpha \) to \(\tilde{\alpha }\). The first step, the definition of \(\tilde{\alpha }\) from \(\alpha \), basically forgets certain information about probability distributions. The natural transformation \(\mathfrak {e}\) is given on a component X by \(\mathfrak {e}_X :\mathcal {S}(A \times X + 1) \rightarrow [0,1]\times [0,1]\times (\mathcal {S}X)^A\),

    $$\begin{aligned} \mathfrak {e}_X(u) = \left\langle \sum _{z \in A \times X+1} u(z) , u(*) , a \mapsto [y \mapsto u(a,y)]\right\rangle \,. \end{aligned}$$

    We have \(\tilde{\alpha }(x) = \langle 1, \alpha (x)(*), a \mapsto [y \mapsto \alpha (x)(a,y)] \rangle \).

  • (ii) Determinisation. In the second step, we turn \(\tilde{\alpha }\) into a Moore automaton over sub-distributions. Formally, the latter will be a coalgebra for the functor \(F_{[0,1]\times [0,1]} :\mathsf {Set}\rightarrow \mathsf {Set}\); recall from Sect. 2 that this is defined by \(F_{[0,1]\times [0,1]}X = [0,1]\times [0,1]\times X^A\). In the remainder of this section we abbreviate \(F_{[0,1]\times [0,1]}\) by F. Notice that \(\tilde{\alpha }\) is an \(F\mathcal {S}\)-coalgebra. Any \(F\mathcal {S}\)-coalgebra determinises to an F-coalgebra, but we spell it out here only for the necessary instance \(\tilde{\alpha }\). For a concrete example, see the first part of Example 5.9 in the next section.

Definition 4.2

The determinisation of the PTS \(\alpha :X \rightarrow \mathcal {D}(A \times X +1)\) is the Moore machine \(\alpha ^\sharp :\mathcal {S}X \rightarrow F \mathcal {S}X = [0,1]\times [0,1]\times (\mathcal {S}X)^A\), defined by:

$$\begin{aligned} \alpha ^\sharp (u) = \left\langle \sum _{x \in X} u(x), \sum _{x \in X} u(x) \cdot \alpha (x)(*), a \mapsto [y \mapsto \sum _{x \in X} u(x) \cdot \alpha (x)(a,y)] \right\rangle \,. \end{aligned}$$
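For a finite PTS, one step of the determinised automaton is a direct computation on sub-distributions. The Python sketch below is illustrative: sub-distributions are dictionaries with exact fractions, `STOP` stands for \(*\), and the two-state PTS is an assumed example (x loops on a with probability 1; y terminates, loops on a, or moves to x on a, each with probability 1/3).

```python
from collections import defaultdict
from fractions import Fraction

STOP = "*"  # stands for the element of 1 = {*}

def determinise_step(alpha, alphabet, u):
    """alpha^sharp(u): (total weight, termination weight, a -> successor sub-distribution)."""
    total = sum(u.values(), Fraction(0))
    stop = sum((p * alpha[x].get(STOP, Fraction(0)) for x, p in u.items()), Fraction(0))
    succ = {a: defaultdict(Fraction) for a in alphabet}
    for x, p in u.items():
        for key, q in alpha[x].items():
            if key != STOP:
                a, y = key
                succ[a][y] += p * q     # weight u(x) * alpha(x)(a, y)
    return total, stop, {a: dict(v) for a, v in succ.items()}

# Assumed example PTS over the alphabet {"a"}.
alpha = {
    "x": {("a", "x"): Fraction(1)},
    "y": {STOP: Fraction(1, 3), ("a", "y"): Fraction(1, 3), ("a", "x"): Fraction(1, 3)},
}

total, stop, nxt = determinise_step(alpha, ["a"], {"y": Fraction(1)})
print(total, stop, nxt["a"])
```

Applied to a Dirac distribution, the first component is 1, which matches the first component of \(\tilde{\alpha }\); iterating the step along a word w produces sub-distributions whose total weight is the probability of exhibiting the path w.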
  • (iii) Factorisation of final coalgebra semantics. Since \(\alpha ^\sharp \) is an F-coalgebra, there exists a unique coalgebra morphism \(\varphi \) from \((\mathcal {S}X, \alpha ^\sharp )\) to the final coalgebra \((([0,1]\times [0,1])^{A^*},\omega )\). This is not quite the right type: the (in)finite trace semantics \(\llbracket - \rrbracket \) is a (probability) measure over words, hence, for each x, \(\llbracket x \rrbracket \) should be an element of the set \(\mathcal {M}(A^\infty )\) of measures over \(A^\infty \) (Sect. 2). In the last step (iii), we equip \(\mathcal {M}(A^\infty )\) with an F-coalgebra structure \({\varPi }\) which is final among F-coalgebras satisfying a certain property, satisfied by our determinisation \(\alpha ^\sharp \). This allows us to factor \(\varphi \) through a coalgebra homomorphism \([-] :\mathcal {S}(X) \rightarrow \mathcal {M}(A^\infty )\). In the more general setting of Sect. 6 we show how the coalgebra structure on \(\mathcal {M}(A^\infty )\) arises from the Giry monad and the final coalgebra of the \(\mathsf {Set}\) endofunctor \(X \mapsto A \times X + 1\). For now, we define it explicitly, which requires:

Definition 4.3

(Measure derivative). Let m be a measure on \(A^\infty \) and \(a\in A\). The map \(m_a\) defined by \(m_a(S) = m(aS)\) for any \(S \in {\varSigma }_{A^\infty }\) is a measure, called the measure derivative of m (with respect to a).

It is easy to check that \(m_a\) as defined above is indeed a measure, so that the measure derivative is well-defined. Now, the coalgebra \({\varPi }:\mathcal {M}(A^\infty ) \rightarrow [0,1]\times [0,1]\times (\mathcal {M}(A^\infty ))^A\) is defined by \( {\varPi }:m \mapsto \langle m(\varepsilon A^\infty ), m(\varepsilon ), a \mapsto m_a \rangle \). Since \({\varPi }\) is an F-coalgebra, we obtain a coalgebra morphism to the final F-coalgebra \(\omega \).

Lemma 4.4

The unique coalgebra morphism from \({\varPi }\) to \(\omega \) is injective.

A proof is given in the more general setting of Lemma 6.8. The following crucial lemma states in which cases the factorisation is possible. It establishes the F-coalgebra \({\varPi }\) as a final object in a certain subcategory of \(\mathrm {CoAlg}(F)\).

Proposition 4.5

Let \(\beta = \langle \beta _{\oplus }, \beta _*, a \mapsto \tau _a \rangle :Y \rightarrow FY\) be an F-coalgebra. The two following conditions are equivalent:

  (i) There exists an F-coalgebra morphism \([-]\) from \(\beta \) to \({\varPi }\).

  (ii) The equation \(\beta _{\oplus } = \beta _* + \sum _{a\in A} \beta _{\oplus } \circ \tau _a\) holds.

In this case, this morphism is unique.

See Theorem 6.9 for a proof in the (more general) continuous setting.

Lemma 4.6

The coalgebra \(\alpha ^\sharp :\mathcal {S}X \rightarrow [0,1]\times [0,1]\times (\mathcal {S}X)^A\) satisfies (ii) in Proposition 4.5.

It is important to note that condition (ii) does not hold in general if the whole construction starts from a coalgebra of the form \(\alpha :X \rightarrow \mathcal {S}(A\times X +1)\). For a PTS to generate infinite trace semantics from the finite traces in a measure-theoretic way, the transition probabilities of every state must sum to 1, i.e., the PTS must use \(\mathcal {D}\) and not \(\mathcal {S}\).
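Since proofs are omitted in this section, the following is only a sketch of why the determinisation satisfies condition (ii). Write \(\alpha ^\sharp _{\oplus }\), \(\alpha ^\sharp _*\) and \(\alpha ^\sharp _a\) for the three components of \(\alpha ^\sharp \) from Definition 4.2 (this component notation is introduced here for illustration). For any \(u \in \mathcal {S}X\):

$$\begin{aligned} \alpha ^\sharp _*(u) + \sum _{a\in A} \alpha ^\sharp _{\oplus }(\alpha ^\sharp _a(u))&= \sum _{x \in X} u(x) \cdot \alpha (x)(*) + \sum _{a\in A} \sum _{y \in X} \sum _{x \in X} u(x) \cdot \alpha (x)(a,y) \\&= \sum _{x \in X} u(x) \cdot \left( \alpha (x)(*) + \sum _{(a,y) \in A \times X} \alpha (x)(a,y) \right) \\&= \sum _{x \in X} u(x) \cdot 1 = \alpha ^\sharp _{\oplus }(u) \,. \end{aligned}$$

The last line uses that each \(\alpha (x)\) is a probability distribution on \(A \times X + 1\). If \(\alpha (x)\) were only a sub-distribution, the bracket could be strictly smaller than 1 and the equality could fail.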

The following result summarises the situation.

Corollary 4.7

The morphism \(\varphi \) decomposes as a unique coalgebra morphism \([-] :\mathcal {S}X \rightarrow \mathcal {M}(A^\infty )\) from \(\alpha ^\sharp \) to \({\varPi }\) followed by an injective coalgebra morphism \(\varphi _{\varPi }\) from \({\varPi }\) to \(\omega \), as shown in the following diagram.

figure c

We thus obtain the semantics \([-] \circ \eta _X :X \rightarrow \mathcal {M}(A^\infty )\) by precomposing with the unit of the monad \(\mathcal {S}\). It coincides with the semantics \(\llbracket - \rrbracket \) of Definition 3.6:

Theorem 4.8

We have \(\llbracket - \rrbracket = [-] \circ \eta _X\).

Theorem 4.8 is the main result of this section, stating that the (in)finite trace semantics is recovered by finality through a determinisation construction. Together with Lemma 4.4, it yields equivalence between the first two points below.

Corollary 4.9

For any \(x,y \in X\), the following are equivalent:

  1. \(\llbracket x \rrbracket = \llbracket y \rrbracket \),

  2. \(\varphi (\delta _x) = \varphi (\delta _y)\),

  3. \(\delta _x \sim \delta _y\),

where \(\sim \) is bisimilarity on the Moore automaton \(\alpha ^\sharp \), the determinisation of \(\alpha \) (Definition 4.2).

The equivalence between 2. and 3. is standard, and was mentioned in Sect. 2. By the equivalence between 1. and 3., we can prove (in)finite trace equivalence by computing bisimulations, which is used in the next section.

5 Computing Trace Equivalence

The aim of this section is to give an algorithm that takes states \(x,y \in X\) of a PTS and tells whether x and y are (in)finite trace equivalent (i.e., \(\llbracket x \rrbracket = \llbracket y \rrbracket \)) or not, based on the determinisation construction described in Sect. 4. Our algorithm is a variant of \(\texttt {HKC}\), an algorithm for language equivalence of non-deterministic automata based on determinisation and bisimulation (up-to) techniques [4]. More specifically, we will use its generalisation to weighted automata given in [2].

Let \(\alpha :X \rightarrow \mathcal {D}(A \times X + 1)\) be a finite-state PTS and \(\alpha ^\sharp \) its determinisation (Definition 4.2). By Corollary 4.9, to prove \(\llbracket x \rrbracket = \llbracket y \rrbracket \) it suffices to show \(\delta _x \sim \delta _y\), i.e., that there is a bisimulation \(R \subseteq \mathcal {S}X \times \mathcal {S}X\) on the determinised Moore automaton such that \((\delta _x, \delta _y) \in R\). However, this task can be simplified using bisimulation up-to techniques, as explained next. In order to use the techniques from [2], we first move from sub-probability distributions to vector spaces. To this end, define \(\mathbb {R}_\omega ^X\) as the set of finitely supported functions \(X \rightarrow \mathbb {R}\), i.e., \(\mathbb {R}_\omega ^X = \{u :X \rightarrow \mathbb {R}\mid u(x) \ne 0 \text { for finitely many }x\}\). We define the Moore automaton \(\overline{\alpha } = \langle \overline{\alpha }_{\oplus }, \overline{\alpha }_*, a \mapsto \overline{\alpha }_a \rangle :\mathbb {R}_\omega ^X \rightarrow \mathbb {R}\times \mathbb {R}\times (\mathbb {R}_\omega ^X)^A\) as follows on any \(u \in \mathbb {R}_\omega ^X\):

$$\begin{aligned} \overline{\alpha }_{\oplus }(u) = \sum _{x \in X} u(x) \quad \overline{\alpha }_*(u) = \sum _{x\in X} u(x) \cdot \alpha (x)(*) \quad \overline{\alpha }_a(u) = \left[ y \mapsto \sum _{x\in X} u(x) \cdot \alpha (x)(a,y) \right] \end{aligned} \qquad (3)$$

This is almost the same construction as in Definition 4.2, with sub-probability distributions replaced by vectors. (Note that this is well-defined since X is assumed to be finite; it would also suffice to assume that \(\alpha \) is finitely branching.) It is easy to see that the embedding \(i :\mathcal {S}X \rightarrow \mathbb {R}^X_\omega \) is an injective \(F_{[0,1]\times [0,1]}\)-coalgebra morphism from \(\alpha ^\sharp \) to \(\overline{\alpha }\). Together with Corollary 4.9, this yields:

Corollary 5.1

For any \(x,y \in X\): \(\llbracket x \rrbracket = \llbracket y \rrbracket \) iff \(\delta _x \sim \delta _y\), where \(\sim \) is bisimilarity on the Moore automaton \(\overline{\alpha }\).

We now formulate bisimulation up to congruence, concretely for \(\overline{\alpha }\).

Definition 5.2

Let \(R \subseteq \mathbb {R}_\omega ^X \times \mathbb {R}_\omega ^X\). Its congruence closure c(R) is the least congruence that contains R, i.e., that satisfies

$$\begin{aligned} \frac{(u,v) \in R}{(u,v) \in c(R)} \quad \frac{}{(u,u) \in c(R)} \quad \frac{(u,v) \in c(R)}{(v,u) \in c(R)} \quad \frac{(u,v) \in c(R) \quad (v,w) \in c(R)}{(u,w) \in c(R)} \end{aligned}$$
$$\begin{aligned} \frac{(u,v) \in c(R)}{(r \cdot u, r \cdot v) \in c(R)}(r \in \mathbb {R}) \qquad \frac{(u,u') \in c(R) \quad (v,v') \in c(R)}{(u+v,u'+v') \in c(R)} \end{aligned}$$

Definition 5.3

Define \(\overline{\alpha } :\mathbb {R}_\omega ^X \rightarrow \mathbb {R} \times \mathbb {R} \times ( \mathbb {R}_\omega ^X)^A\) from a finite-state PTS \(\alpha \), as in Eq.  (3). A relation \(R \subseteq \mathbb {R}_\omega ^X \times \mathbb {R}_\omega ^X\) is a bisimulation up to congruence (on \(\overline{\alpha }\)) if for all \((u,v) \in R\):

  • \(\overline{\alpha }_{\oplus }(u) = \overline{\alpha }_{\oplus }(v)\), \(\overline{\alpha }_*(u) = \overline{\alpha }_*(v)\), and

  • \(\forall a \in A\): \((\overline{\alpha }_a(u), \overline{\alpha }_a(v)) \in c(R)\).

The following result states soundness of bisimulations up to congruence. This can either be proved from the abstract coalgebraic theory [3] or more directly using compatible functions, as in [2, 4].

Theorem 5.4

For any \(u,v \in \mathbb {R}^X_\omega \): \(u \sim v\) iff there is a bisimulation up to congruence R (on \(\overline{\alpha }\)) such that \((u,v) \in R\).

Combined with Corollary 5.1, this means that to prove that \(\llbracket x \rrbracket =\llbracket y \rrbracket \) for states x, y of a PTS, it suffices to show that there is a bisimulation up to congruence relating \(\delta _x\) and \(\delta _y\). The following algorithm attempts to compute one, given x and y.

$$\begin{aligned} \texttt {HKC}^\infty (x,y) \end{aligned}$$
figure d

Theorem 5.5

Whenever \(\texttt {HKC}^\infty (x,y)\) terminates, it returns true iff \(\llbracket x \rrbracket = \llbracket y \rrbracket \).

Despite the fact that during the determinisation the state space always becomes infinite, the following results show that if the initial state space X is finite, then \(\texttt {HKC}^\infty \) does terminate.

Theorem 5.6

(see [6]). Let \(\mathcal {R}\) be a ring and X be a finite set. Let \(R \subseteq \mathcal {R}^X \times \mathcal {R}^X\) be a relation and let \((v,v') \in \mathcal {R}^X \times \mathcal {R}^X\) be a pair of vectors. Let \(U_R = \{ u - u' \mid (u,u') \in R \}\). Then \((v,v') \in c(R)\) iff \(v - v' \in [U_R]\), where \([U_R]\) is the submodule of \(\mathcal {R}^X\) generated by \(U_R\).

Proposition 5.7

If X is finite, \(\texttt {HKC}^\infty (x,y)\) terminates for every \(x,y \in X\).

Example 5.8

To begin with, here is a very simple PTS which we use to demonstrate the need for bisimulation up to congruence over plain bisimulations.

figure e

A bisimulation on the determinised automaton containing \((\delta _x,\delta _y)\) would require adding \((\delta _x/2^k, \delta _y/2^k)\) for all k to the relation. However, \(\texttt {HKC}^\infty (x,y)\) (which computes a bisimulation up to congruence) stops after one step because it spots that \((\delta _x/2, \delta _y/2)\) is in the congruence closure of the relation \(\{(\delta _x, \delta _y)\}\).

Example 5.9

Consider the PTS depicted on the left below. We will use \(\texttt {HKC}^\infty \) to check if the states x and z are (in)finite trace equivalent.

figure f

First, we compute part of the determinised automaton. To this end, observe that because X is finite, \(\mathbb {R}_\omega ^X = \mathbb {R}^X\) has a basis \((e_x,e_y,e_z,e_i)\). An element \(u \in \mathbb {R}_\omega ^X\) is seen as a column vector \( u_x e_x + u_y e_y + u_z e_z + u_i e_i\) in this basis. Moreover \(\overline{\alpha }_{\oplus }\) and \(\overline{\alpha }_*\) are linear forms that can be seen as the row vectors \(L_{\oplus } = \begin{pmatrix} 1&1&1&1 \end{pmatrix}\) and \(L_* = \begin{pmatrix} 1/3&2/3&1/3&0 \end{pmatrix}\), and \(\overline{\alpha }_a\) is an endomorphism with a transition matrix \(M_a\) defined by \((M_a)_{j,k} = t_a(k)(j)\). This is depicted on the right above.

We represent here two parts of the determinised automaton. The first is the path beginning with the single state x; the second is the path beginning with the single state z. Each state here has two real outputs, obtained by matrix multiplication with \(L_{\oplus }\) and \(L_*\).

figure g

Now, \(\texttt {HKC}^\infty (x,z)\) begins with \(\texttt {todo} = \{ (\eta _X(x),\eta _X(z)) \} = \{ (e_x,e_z) \}\) and \(\texttt {R} = \emptyset \). It checks that \(L e_x = L e_z\), etc. as shown in the following table.







| Loop counter | \((u,v)\) extracted from \(\texttt {todo}\) | Check \((u,v) \in c(R)\) | Check \(Lu = Lv\) | \((M_a u , M_a v)\) added to \(\texttt {todo}\) | Cardinality of R |
|---|---|---|---|---|---|
| 1 | \(( \begin{pmatrix} 1 \\ 0 \\ 0 \\ 0 \end{pmatrix}, \begin{pmatrix} 0 \\ 0 \\ 1 \\ 0 \end{pmatrix} )\) | no | \(\begin{pmatrix} 1 \\ 1/3 \end{pmatrix} \,{=}\, \begin{pmatrix} 1 \\ 1/3 \end{pmatrix}\) | \(( \begin{pmatrix} 0 \\ 1/6 \\ 0 \\ 1/2 \end{pmatrix} , \begin{pmatrix} 0 \\ 0 \\ 1/3 \\ 1/3 \end{pmatrix} )\) | |
| 2 | \(( \begin{pmatrix} 0 \\ 1/6 \\ 0 \\ 1/2 \end{pmatrix} , \begin{pmatrix} 0 \\ 0 \\ 1/3 \\ 1/3 \end{pmatrix} )\) | no | \(\begin{pmatrix} 2/3 \\ 1/9 \end{pmatrix} \,{=}\, \begin{pmatrix} 2/3 \\ 1/9 \end{pmatrix}\) | \(( \begin{pmatrix} 0 \\ 1/18 \\ 0 \\ 1/2 \end{pmatrix} , \begin{pmatrix} 0 \\ 0 \\ 1/9 \\ 4/9 \end{pmatrix})\) | |
| 3 | \(( \begin{pmatrix} 0 \\ 1/18 \\ 0 \\ 1/2 \end{pmatrix} , \begin{pmatrix} 0 \\ 0 \\ 1/9 \\ 4/9 \end{pmatrix})\) | yes | | | |











The check succeeds in loop 3 because \((u,v) \in c(R)\) according to Theorem 5.6:

$$\begin{aligned} \begin{pmatrix} 0 \\ 1/18 \\ 0 \\ 1/2 \end{pmatrix} - \begin{pmatrix} 0 \\ 0 \\ 1/9 \\ 4/9 \end{pmatrix} = \begin{pmatrix} 0 \\ 1/18 \\ -1/9 \\ 1/18 \end{pmatrix} = \frac{1}{3} \begin{pmatrix} 0 \\ 1/6 \\ -1/3 \\ 1/6 \end{pmatrix} = \frac{1}{3} \left( \begin{pmatrix} 0 \\ 1/6 \\ 0 \\ 1/2 \end{pmatrix} - \begin{pmatrix} 0 \\ 0 \\ 1/3 \\ 1/3 \end{pmatrix} \right) \end{aligned}$$

Because \(\texttt {todo}\) is eventually empty, the algorithm returns \(\texttt {true}\). Indeed, if we compute directly the measures \(\llbracket x \rrbracket \) and \(\llbracket z \rrbracket \), we can see that \(\llbracket x \rrbracket (a^n) = 1/3^{n+1}\), \(\llbracket x \rrbracket (a^\omega ) = 1/2\) and similarly for \(\llbracket z \rrbracket \). Here the bisimulation up to congruence check is necessary for termination. The construction of a bisimulation up to equivalence (dashed + dotted lines on the determinised automaton picture) would take an infinite number of steps. But the construction of the bisimulation up to congruence (dashed lines) takes only 2 steps.
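The run above can be replayed mechanically. The sketch below follows the columns of the table (extract a pair, test membership in \(c(R)\) via Theorem 5.6, compare outputs, push successors); it is not a transcription of the algorithm figure. The row vectors \(L_{\oplus }\) and \(L_*\) are taken from the text, whereas the matrix `M_A` is an assumption reconstructed from the table entries, and all function names are illustrative.

```python
from fractions import Fraction as Fr

# Example 5.9 in the basis (e_x, e_y, e_z, e_i). L_PLUS and L_STAR come from
# the text; M_A is an assumption, reconstructed from the table entries.
L_PLUS = [Fr(1), Fr(1), Fr(1), Fr(1)]
L_STAR = [Fr(1, 3), Fr(2, 3), Fr(1, 3), Fr(0)]
M_A = [  # column k holds the image of the k-th basis vector
    [Fr(0), Fr(0), Fr(0), Fr(0)],
    [Fr(1, 6), Fr(1, 3), Fr(0), Fr(0)],
    [Fr(0), Fr(0), Fr(1, 3), Fr(0)],
    [Fr(1, 2), Fr(0), Fr(1, 3), Fr(1)],
]
E_X = (Fr(1), Fr(0), Fr(0), Fr(0))
E_Y = (Fr(0), Fr(1), Fr(0), Fr(0))
E_Z = (Fr(0), Fr(0), Fr(1), Fr(0))


def dot(row, u):
    return sum(a * b for a, b in zip(row, u))


def apply(matrix, u):
    return tuple(dot(row, u) for row in matrix)


def reduce_against(vec, basis):
    """Subtract multiples of the echelon rows, in insertion order."""
    vec = list(vec)
    for pivot, row in basis:
        coeff = vec[pivot]
        vec = [x - coeff * y for x, y in zip(vec, row)]
    return vec


def hkc(u0, v0, matrices, outputs):
    """Return (verdict, R) for the pair (u0, v0) of vectors."""
    todo = [(u0, v0)]
    R = []
    basis = []  # echelon basis of span{u - v | (u, v) in R}
    while todo:
        u, v = todo.pop()
        rest = reduce_against([a - b for a, b in zip(u, v)], basis)
        if all(x == 0 for x in rest):
            continue  # (u, v) is already in c(R) by Theorem 5.6
        if any(dot(row, u) != dot(row, v) for row in outputs):
            return False, R  # the two states have different outputs
        todo.extend((apply(M, u), apply(M, v)) for M in matrices)
        R.append((u, v))
        pivot = next(i for i, x in enumerate(rest) if x != 0)
        basis.append((pivot, [x / rest[pivot] for x in rest]))
    return True, R


verdict, relation = hkc(E_X, E_Z, [M_A], [L_PLUS, L_STAR])
print(verdict, len(relation))
```

In this sketch every pair added to `R` enlarges the span of differences by one dimension, so at most \(|X|\) pairs can be added before `todo` drains.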

6 Continuous Systems

In this section, we generalise the determinisation construction for (in)finite trace semantics defined above to the case of continuous PTSs, defined later as coalgebras for the analogue of the functor \(\mathcal {D}(A\times - + 1)\) in the category \(\mathsf {Meas}\) (see [13] for examples of such PTSs). The underlying distributive law is made explicit, so that the origin of the determinisation process is better understood. The following table sums up the analogies and differences with the discrete case.


| | Discrete case | General case |
|---|---|---|
| | \(\mathsf {Set}\) | \(\mathsf {Meas}\) |
| Usual operation | \(\sum \) | \(\int \) |
| Machine functor | \(FX = [0,1]\times [0,1]\times X^A\) | Measurable version of F |
| Probability monad | Distribution monad \(\mathcal {D}\) | Giry’s monad \(\mathbb {D}\) |
| Determinisation monad | Sub-distribution monad \(\mathcal {S}\) | Sub-Giry’s monad \(\mathbb {S}\) |
| PTS state space | Set X | Measurable space \((X,{\varSigma }_X)\) |
| Determinised state | Finitely supported vector | Measure (\(\le 1\)) |
| | Matrix \(t_a :X \times X \rightarrow [0,1]\) | Kernel \(t_a :X \times {\varSigma }_X \rightarrow [0,1]\) |
| Final F-coalgebra | \(\omega \) | Measurable version of \(\omega \) |
| Measure coalgebra | \({\varPi }\) | Measurable version of \({\varPi }\) |
| Pseudo-final morphism | \([-] :\mathcal {S}{X} \rightarrow \mathcal {M}(A^\infty )\) | \([-] :\mathbb {S}X \rightarrow \mathbb {S}A^\infty \) |

In this section we work in the category \(\mathsf {Meas}\) of measurable spaces and functions. It is easy to adapt F, but considering the monads we will need some additional measure-theoretic background.

Product. Given measurable spaces \((X,{\varSigma }_X)\) and \((Y,{\varSigma }_Y)\), we define a product \(\sigma \)-algebra on \(X \times Y\) by \({\varSigma }_X \otimes {\varSigma }_Y = \sigma _{X \times Y} (\{ S_X \times S_Y \mid S_X \in {\varSigma }_X, S_Y \in {\varSigma }_Y \})\). The product of measurable spaces is then defined by \((X,{\varSigma }_X) \otimes (Y,{\varSigma }_Y) = (X \times Y, {\varSigma }_X \otimes {\varSigma }_Y)\).

Sum. Given measurable spaces \((X,{\varSigma }_X)\) and \((Y,{\varSigma }_Y)\), we define a sum \(\sigma \)-algebra on the disjoint union \(X+Y = \{ (x,0) \mid x \in X \} \cup \{ (y,1) \mid y \in Y \}\) by \({\varSigma }_X \oplus {\varSigma }_Y = \{ S_X + S_Y \mid S_X \in {\varSigma }_X, S_Y \in {\varSigma }_Y \}\). The sum of measurable spaces is then defined by \((X,{\varSigma }_X)\oplus (Y,{\varSigma }_Y) = (X+Y,{\varSigma }_X \oplus {\varSigma }_Y)\).

Given measurable spaces X, Y and a measurable function \(f :X \rightarrow Y\), define a new functor \(\mathfrak {L}:\mathsf {Meas}\rightarrow \mathsf {Meas}\) by \(\mathfrak {L}X = A \times X + 1\) along with its canonical \(\sigma \)-algebra \({\varSigma }_{\mathfrak {L}X} = \mathcal {P}(A) \otimes {\varSigma }_X \oplus \mathcal {P}(1)\), and \(\mathfrak {L}f = id_A \times f + id_1\). Moreover, define \(FX = [0,1]\times [0,1]\times X^A\) along with its \(\sigma \)-algebra \(\mathcal {B}([0,1]) \otimes \mathcal {B}([0,1]) \otimes \bigotimes _{a\in A} {\varSigma }_X\) and \(Ff = id_{[0,1]} \times id_{[0,1]} \times f^A\).

Integration. Let \((X,{\varSigma }_X,m)\) be a measure space and \(f :X \rightarrow \mathbb {R}\) be a measurable function. If \(f(X) = \{ \alpha _1,\ldots ,\alpha _n \}\) for some \(\alpha _1,\ldots ,\alpha _n \in \mathbb {R}_+\), then f is called a simple function and its integral can be set as \(\int _X f dm = \sum _{i=1}^n \alpha _i m(f^{-1}(\{ \alpha _i \}))\). If f is non-negative, define \(\int _X f dm = \sup \left\{ \int _X g dm \mid g \le f ,\, g\text { simple}\right\} \in [0,\infty ]\). Finally, for any \(f :X \rightarrow \mathbb {R}\), decompose \(f = f^+ - f^-\) where \(f^+ \ge 0\) and \(f^- \ge 0\). If their integrals are not both \(\infty \), define \(\int _X f dm = \int _X f^+ dm - \int _X f^- dm\). If this is finite, we say that f is m-integrable. Furthermore, for any \(S \in {\varSigma }_X\), the indicator function \(\mathbf 1 _S\) is measurable and we define \(\int _S f dm = \int _X \mathbf 1 _S f dm\).
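To make the first step of this definition concrete, here is a minimal sketch of the integral of a simple function, assuming a finite set with the discrete \(\sigma \)-algebra so that a measure is determined by its point masses. The names `integrate_simple` and `integrate_on` are illustrative.

```python
from fractions import Fraction


def integrate_simple(f, m):
    """Sum of value * m(f^{-1}({value})) over the finitely many values of f.

    m maps each point x of a finite set to the mass m({x}).
    """
    total = Fraction(0)
    for value in set(f(x) for x in m):
        preimage_mass = sum(w for x, w in m.items() if f(x) == value)
        total += value * preimage_mass
    return total


def integrate_on(S, f, m):
    """Integral over a subset S, computed as the integral of 1_S * f."""
    return integrate_simple(lambda x: f(x) if x in S else 0, m)
```

With `m = {0: Fraction(1, 2), 1: Fraction(1, 4), 2: Fraction(1, 4)}` and `f` the identity, `integrate_simple(f, m)` evaluates to 3/4 and `integrate_on({2}, f, m)` to 1/2.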

Given a measurable function \(g :X \rightarrow Y\) and measure \(m :{\varSigma }_X \rightarrow \mathbb {R}_+\), the pushforward measure of m by g is \(m \circ g^{-1}\). For any measurable \(f :Y \rightarrow \mathbb {R}\), f is \(m \circ g^{-1}\)-integrable iff \(f \circ g\) is m-integrable and in this case, \(\int _Y f d(m\circ g^{-1}) = \int _X (f \circ g) dm\). Each positive measurable function \(X \rightarrow \mathbb {R_+}\) is the pointwise limit of an increasing sequence of simple functions. To prove some property for every positive measurable function, one can prove it for simple functions (or for indicator functions, if it is preserved by linear combinations) and show it is preserved by limits. Many such proofs use the monotone convergence theorem (see [14]), which states that if \((f_n)_{n\in \mathbb {N}}\) is an increasing sequence of positive functions with pointwise limit f, then f is measurable and \(\int _X f dm = \lim \int _X f_n dm\).
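The change-of-variables identity for pushforward measures can be checked directly on finitely supported measures. This is only a sketch under that finiteness assumption; the helper names `pushforward` and `integral` are illustrative.

```python
from collections import defaultdict
from fractions import Fraction


def pushforward(m, g):
    """The measure m o g^{-1}, for m given by point masses {x: m({x})}."""
    out = defaultdict(Fraction)
    for x, w in m.items():
        out[g(x)] += w
    return dict(out)


def integral(f, m):
    """Integral of f against a finitely supported measure."""
    return sum(f(x) * w for x, w in m.items())


m = {1: Fraction(1, 2), 2: Fraction(1, 3), 3: Fraction(1, 6)}
g = lambda x: x % 2
f = lambda y: 10 * y + 1
# Both sides of the identity: integrate f against m o g^{-1}, or f o g against m.
lhs = integral(f, pushforward(m, g))
rhs = integral(lambda x: f(g(x)), m)
print(lhs == rhs)
```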

The Giry Monad. The Giry monad [8] provides a link between probability theory and category theory. In \(\mathsf {Meas}\), the Giry monad \((\mathbb {D},\eta ,\mu )\) is defined as follows. For any measurable space X, \(\mathbb {D}X\) is the space of probability measures over \((X,{\varSigma }_X)\), and \({\varSigma }_{\mathbb {D}X}\) is the \(\sigma \)-algebra generated by the functions \(e^X_S :\mathbb {D}X \rightarrow [0,1]\) defined by \(e^X_S(m) = m(S)\). For any measurable function \(g :X \rightarrow Y\), \((\mathbb {D}g)(m) = m \circ g^{-1}\). The unit is defined by \(\eta _X(x)(S) = \mathbf 1 _S(x)\) and the multiplication by \(\mu _X({\varPhi })(S) = \int _{\mathbb {D}X} e_S^X d{\varPhi }\). Similarly, one defines the sub-Giry monad \((\mathbb {S},\eta ,\mu )\), with the only difference that \(\mathbb {S}X\) is the space of sub-probability measures over \((X,{\varSigma }_X)\). There is a natural embedding of \(\mathbb {D}\) in \(\mathbb {S}\), denoted by \(\iota :\mathbb {D}\Rightarrow \mathbb {S}\).
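On finitely supported measures, the unit and multiplication take a very simple form, since the integral defining \(\mu _X\) becomes a weighted sum. The sketch below assumes this finite setting and represents an element of \(\mathbb {S}\mathbb {S}X\) as a list of (measure, weight) pairs; this representation and the function names are illustrative choices.

```python
from fractions import Fraction


def unit(x):
    """eta_X(x): the Dirac measure at x."""
    return {x: Fraction(1)}


def evaluate(m, S):
    """e_S(m) = m(S) for a finitely supported measure m."""
    return sum((w for x, w in m.items() if x in S), Fraction(0))


def multiply(Phi):
    """mu_X(Phi): S |-> sum over (m, weight) in Phi of weight * m(S)."""
    out = {}
    for m, weight in Phi:
        for x, w in m.items():
            out[x] = out.get(x, Fraction(0)) + weight * w
    return out


inner = {"a": Fraction(1, 2), "b": Fraction(1, 2)}
Phi = [(inner, Fraction(1, 3)), (unit("a"), Fraction(2, 3))]
flat = multiply(Phi)
```

Here `flat` assigns mass 5/6 to `"a"` and 1/6 to `"b"`. Allowing the weights in `Phi` to sum to less than 1 gives the corresponding computation for the sub-Giry monad.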

6.1 Trace Semantics via Determinisation

The aim of this section is to define trace semantics for continuous PTS, i.e., coalgebras of the form \(\alpha :X \rightarrow \mathbb {D}(A \times X + 1)\) where X is a measurable space. We proceed in the same way as for discrete systems.

  • (i) Transform \(\alpha \) into a more convenient coalgebra \(\tilde{\alpha } :X \rightarrow F \mathbb {S}X\).

  • (ii) Determinise \(\tilde{\alpha }\) into an F-coalgebra \(\alpha ^\sharp :\mathbb {S}X \rightarrow F\mathbb {S}X\).

  • (iii) Factorise the final morphism: \(\varphi _{\alpha ^\sharp } = \varphi _{\varPi }\circ [-]\), then precompose with \(\eta _X\).

The following diagram sums up the construction. Here \({\varSigma }_{([0,1]\times [0,1])^{A^*}}\) is the \(\sigma \)-algebra generated by the functions \(L \mapsto L(w)\).

figure h
  • (i) Translation: from \(\alpha \) to \(\tilde{\alpha }\)

Proposition 6.1

For any measurable space X, the function \(\mathfrak {e}_X :\mathbb {S}\mathfrak {L}X \rightarrow F \mathbb {S}X\) defined by \( \mathfrak {e}_X(m) = \langle m(\mathfrak {L}X), m(1), a \mapsto [S \mapsto m(\{a\}\times S)]\rangle \) is measurable. Moreover, \(\mathfrak {e}:\mathbb {S}\mathfrak {L}\Rightarrow F \mathbb {S}\) is a natural transformation.

Now take \(\tilde{\alpha } = \mathfrak {e}_X \circ \iota _{\mathfrak {L}X} \circ \alpha :X \rightarrow [0,1]\times [0,1]\times (\mathbb {S}X)^A\). Explicitly:

$$\begin{aligned} \tilde{\alpha }(x) = \langle \underbrace{\alpha (x)(\mathfrak {L}X)}_1, \alpha (x)(1), a \mapsto [S \mapsto \alpha (x)(\{a\} \times S)]\rangle \end{aligned}$$

We decompose it as a pairing \(\tilde{\alpha } = \langle \tilde{\alpha }_{\oplus }, \tilde{\alpha }_*, a \mapsto t_a \rangle \).

  • (ii) Determinisation. We recall some basic observations in abstract determinisation [12, 16]. By distributive law here we mean the standard notion of distributive law of monad over functor (called EM-law in [12]).

Lemma 6.2

Let \(\mathbf C \) be a category, \(F :\mathbf C \rightarrow \mathbf C \) be an endofunctor and \((T,\eta ,\mu )\) be a monad on \(\mathbf C \). Let \(f :X \rightarrow FTX\) be an FT-coalgebra and \(h :TFTX \rightarrow FTX\) be an Eilenberg-Moore T-algebra. Then there exists a unique T-algebra morphism \(f^\sharp :(TX,\mu _X) \rightarrow (FTX,h)\) such that \(f = f^\sharp \circ \eta _X\).

Lemma 6.3

With the same notations as for Lemma 6.2, and given a distributive law \(\lambda :T F \Rightarrow F T\), then \(h = F\mu _X \circ \lambda _{T X} :TFTX \rightarrow FTX\) is an Eilenberg-Moore T-algebra.

The next step is to define a distributive law \(\lambda :\mathbb {S}F \Rightarrow F \mathbb {S}\) in order to apply Lemmas 6.2 and 6.3. In the following we write \(id_{FX} = \langle \pi _X^{\oplus }, \pi _X^*, a \mapsto \pi _X^a \rangle \). Note that \(\pi ^\epsilon :F \Rightarrow [0,1]\) (for \(\epsilon \in \{*,{\oplus }\}\)) and \(\pi ^a :F \Rightarrow Id_\mathbf C \) (for \(a\in A\)) are natural transformations.

Lemma 6.4

Let \(g :\mathbb {S}([0,1]) \rightarrow [0,1]\) be defined by \(g(m) = \int _{[0,1]} id_{[0,1]} dm\). Then g is measurable and an Eilenberg-Moore \(\mathbb {S}\)-algebra.

For any object X of \(\mathsf {Meas}\), define \(\lambda _X :\mathbb {S}F X \rightarrow F \mathbb {S}X\) by

$$\begin{aligned} \lambda _X = \langle g \circ \mathbb {S}\pi _X^{\oplus }, g \circ \mathbb {S}\pi _X^*, a \mapsto \mathbb {S}\pi _X^a \rangle \end{aligned}$$

This is a measurable function because each component is measurable.

Proposition 6.5

\(\lambda :\mathbb {S}F \Rightarrow F \mathbb {S}\) is a distributive law.

Let us compute the value of our resulting determinisation. Given \(\tilde{\alpha } :X \rightarrow F\mathbb {S}X\), take \(h = F\mu _X \circ \lambda _{\mathbb {S}X}\) (Lemma 6.3) and \(\alpha ^\sharp = h \circ \mathbb {S}\tilde{\alpha }\) (Lemma 6.2). We get

$$\begin{aligned} \alpha ^\sharp&= h \circ \mathbb {S}\tilde{\alpha } \\&= F\mu _X \circ \lambda _{\mathbb {S}X}\circ \mathbb {S}\tilde{\alpha } \\&= F\mu _X \circ \langle g \circ \mathbb {S}( \pi ^{\oplus }_{\mathbb {S}X} \circ \tilde{\alpha }), g \circ \mathbb {S}(\pi ^*_{\mathbb {S}X} \circ \tilde{\alpha }), a \mapsto \mathbb {S}(\pi ^a_{\mathbb {S}X} \circ \tilde{\alpha }) \rangle \\&= \langle g \circ \mathbb {S}\tilde{\alpha }_{\oplus }, g \circ \mathbb {S}\tilde{\alpha }_*, a\mapsto \mu _X \circ \mathbb {S}t_a \rangle \end{aligned}$$

For \(m \in \mathbb {S}X\), the following more explicit expression shows that the coalgebra arising from the determinisation is natural, in the sense that the components of \(\alpha ^\sharp \) are obtained by integrating the information provided by \(\alpha \) against m.

$$\begin{aligned} \alpha ^\sharp (m)&= \left\langle \int _X \tilde{\alpha }_{\oplus } dm, \int _X \tilde{\alpha }_* dm, a\mapsto \left[ S \mapsto \int _X t_a(-)(S) dm\right] \right\rangle \\&= \left\langle \int _X \alpha (-)(\mathfrak {L}X) dm, \int _X \alpha (-)(1) dm, a\mapsto \left[ S \mapsto \int _X \alpha (-)(\{a\}\times S) dm \right] \right\rangle \end{aligned}$$
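When X is finite and m is given by point masses, the integrals above are finite sums. The following sketch computes \(\alpha ^\sharp (m)\) in that special case only. The two-state PTS `alpha` is a hypothetical example invented for illustration, and the encoding of \(\mathfrak {L}X\) (the string `"*"` for the element of 1, pairs `(a, y)` for transitions) is an assumption of the sketch.

```python
from fractions import Fraction as Fr

STOP = "*"  # the unique element of 1


def determinise(alpha, labels):
    """Build alpha_sharp for a finite PTS alpha: state -> {outcome: probability}."""
    def alpha_sharp(m):
        # First component: integral of alpha(-)(LX) against m.
        total = sum((w * sum(alpha[x].values()) for x, w in m.items()), Fr(0))
        # Second component: integral of alpha(-)(1) against m.
        stop = sum((w * alpha[x].get(STOP, Fr(0)) for x, w in m.items()), Fr(0))
        # Third component: a |-> [y |-> integral of alpha(-)({a} x {y}) against m].
        successors = {}
        for a in labels:
            t = {}
            for x, w in m.items():
                for outcome, p in alpha[x].items():
                    if outcome != STOP and outcome[0] == a:
                        t[outcome[1]] = t.get(outcome[1], Fr(0)) + w * p
            successors[a] = t
        return total, stop, successors
    return alpha_sharp


# Hypothetical PTS on states p, q with labels a, b.
alpha = {
    "p": {STOP: Fr(1, 2), ("a", "q"): Fr(1, 2)},
    "q": {STOP: Fr(1, 2), ("a", "q"): Fr(1, 4), ("b", "p"): Fr(1, 4)},
}
alpha_sharp = determinise(alpha, ["a", "b"])
result = alpha_sharp({"p": Fr(1, 2), "q": Fr(1, 4)})
```

For this input the three components are 3/4, 3/8 and the sub-distributions `{"q": 5/16}` for a and `{"p": 1/16}` for b, whose masses add up with 3/8 to the total 3/4.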
  • (iii) Final coalgebra. This heavy determinisation part gives us an F-coalgebra \(\alpha ^\sharp \). There exists a final object in \(\mathrm {CoAlg}(F)\):

Proposition 6.6

Let \({\varOmega }= ([0,1]\times [0,1])^{A^*}\) and \({\varSigma }_{\varOmega }\) be the smallest \(\sigma \)-algebra that makes the functions \(e_w :{\varOmega }\rightarrow [0,1]\times [0,1]\) defined by \(e_w(L) = L(w)\) measurable for every \(w \in A^*\). Let \(\omega :{\varOmega }\rightarrow F{\varOmega }\) be defined by \(\omega (L) = \langle L(\varepsilon ), a \mapsto L_a \rangle \). Then \(({\varOmega },\omega )\) is a final F-coalgebra.

Thus for any F-coalgebra \(\beta \) the final morphism towards \(\omega \), denoted \(\varphi _\beta \), gives a canonical notion of semantics. What we want is something slightly more specific that takes into account the way \(\alpha ^\sharp \) was built to produce a sub-probability measure in \(\mathbb {S}A^\infty \). This is obtained via the coalgebra \({\varPi }:\mathbb {S}A^\infty \rightarrow F\mathbb {S}A^\infty \), built as follows.

Proposition 6.7

Let \(\pi :A^\infty \rightarrow \mathfrak {L}A^\infty \) be defined by \(\pi (\varepsilon ) = *\) and \(\pi (aw) = (a,w)\). This is a final \(\mathfrak {L}\)-coalgebra.

Let \({\varPi }= \mathfrak {e}_{A^\infty } \circ \mathbb {S}\pi \). One can check that with this definition, \({\varPi }: \mathbb {S}A^\infty \rightarrow F \mathbb {S}A^\infty \) has the same expression as the \({\varPi }: \mathcal {M}(A^\infty ) \rightarrow F\mathcal {M}(A^\infty )\) of Sect. 4:

$$\begin{aligned} {\varPi }(m)&= \langle m(\pi ^{-1}(\mathfrak {L}A^\infty )), m(\pi ^{-1}(1)), a \mapsto [S \mapsto m(\pi ^{-1}(\{a\} \times S))] \rangle \\&= \langle m(A^\infty ), m(\varepsilon ), a \mapsto m_a \rangle \end{aligned}$$
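For intuition, the expression of \({\varPi }\) can be evaluated on measures concentrated on finitely many finite words. This is a strong simplifying assumption, since measures on \(A^\infty \) generally also give mass to sets of infinite words; the sketch is therefore only a toy instance. Words are encoded as Python strings over single-character labels, and the name `big_pi` is illustrative.

```python
from fractions import Fraction as Fr


def big_pi(m, labels):
    """Return (m(A^infty), m({epsilon}), a |-> m_a) for m = {word: mass}.

    The derivative m_a gives to a word w the mass that m gives to a.w.
    """
    total = sum(m.values(), Fr(0))
    at_empty_word = m.get("", Fr(0))
    derivatives = {
        a: {w[1:]: p for w, p in m.items() if w[:1] == a} for a in labels
    }
    return total, at_empty_word, derivatives


m = {"": Fr(1, 3), "a": Fr(1, 9), "ab": Fr(1, 9)}
total, at_empty_word, derivatives = big_pi(m, "ab")
```

Here the total mass 5/9 splits as 1/3 on the empty word plus 2/9 carried by the derivative along a, while the derivative along b is the zero measure.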

The aim is now to factorise the semantics obtained via \(\omega \) into semantics obtained via \({\varPi }\). The following result is a kind of completeness property for this operation.

Lemma 6.8

The final morphism \(\varphi _{\varPi }\) from \({\varPi }\) to \(\omega \) is injective.


Proof. For any \(m, m' \in \mathbb {S}A^\infty \), in order to have \(m=m'\), it is sufficient that \(m_{|S_\infty } = m'_{|S_\infty }\) (see Theorem 3.5). By induction on w, we prove that if \(m,m' \in \mathbb {S}A^\infty \) satisfy \(\varphi _{\varPi }(m) = \varphi _{\varPi }(m')\), then \(\langle m(wA^\infty ),m(w) \rangle = \langle m'(wA^\infty ), m'(w) \rangle \). First, \(\langle m(\varepsilon A^\infty ), m(\varepsilon ) \rangle = \varphi _{\varPi }(m)(\varepsilon ) = \varphi _{\varPi }(m')(\varepsilon ) = \langle m'(\varepsilon A^\infty ), m'(\varepsilon ) \rangle \). Note that \(\varphi _{\varPi }(m) = \varphi _{\varPi }(m')\) implies \(\varphi _{\varPi }(m_a)(w) = \varphi _{\varPi }(m)(aw) = \varphi _{\varPi }(m')(aw) = \varphi _{\varPi }(m'_a)(w)\) so that \(\varphi _{\varPi }(m_a) = \varphi _{\varPi }(m'_a)\). Use the induction hypothesis to obtain that \(\langle m(awA^\infty ), m(aw) \rangle = \langle m_a(wA^\infty ), m_a(w) \rangle = \langle m'_a(wA^\infty ), m'_a(w) \rangle = \langle m'(awA^\infty ), m'(aw) \rangle \). This completes the induction, so m and \(m'\) coincide on \(S_\infty \), hence \(m=m'\). \(\square \)

The following theorem states precisely in which cases the factorisation can be done. It is a variant of Theorem 3.5 that makes explicit the single step performed by the system. This version is stronger than Proposition 4.5, because it also proves that the involved functions are measurable.

Theorem 6.9

Let \(\beta = \langle \beta _{\oplus }, \beta _*, a \mapsto \tau _a \rangle :Y \rightarrow F Y\) be an F-coalgebra. The two following conditions are equivalent:

  • (i) There exists an F-coalgebra morphism \([-]\) from \(\beta \) to \({\varPi }\).

  • (ii) The equation \(\beta _{\oplus } = \beta _* + \sum _{a\in A} \beta _{\oplus } \circ \tau _a\) holds.

In this case, this morphism is unique.

For convenience we denote \(e_S^{A^\infty } \circ [-]\) by \([-](S)\), and \(\phi _a \circ [-]\) by \([-]_a\), where the measure derivative function \(\phi _a :m \mapsto m_a\) is measurable as a component of \({\varPi }\).


Proof. \((i) \Rightarrow (ii)\). Suppose \([-]\) is a coalgebra morphism from \(\beta \) to \({\varPi }\). Commutation of the diagram yields \(\langle \beta _{\oplus }, \beta _*, a \mapsto [-] \circ \tau _a \rangle = \langle [-](A^\infty ), [-](\varepsilon ), a \mapsto [-]_a \rangle \). Let \(y \in Y\). Because \([y]\) is a measure, \(\beta _{\oplus }(y) = [y](\varepsilon A^\infty ) = [y] (\varepsilon ) + \sum _{a\in A} [y] (aA^\infty )\). Thus \(\beta _{\oplus }(y) = \beta _* (y) + \sum _{a\in A} [\tau _a(y)](A^\infty ) = \beta _*(y) + \sum _{a\in A} (\beta _{\oplus } \circ \tau _a) (y)\).

Uniqueness. If \([-]'\) is another such morphism, we have \([-](A^\infty ) = [-]'(A^\infty )\), \([-](\varepsilon ) = [-]'(\varepsilon )\) and for any \(a \in A\), \([-] \circ \tau _a = [-]_a\) and \([-]' \circ \tau _a = [-]'_a\). An immediate induction yields \([-]_{|S_\infty } = [-]'_{|S_\infty }\), thus \([-] = [-]'\) by Theorem 3.5.

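\((ii)\Rightarrow (i)\). Assume that (ii) holds. Let us define \([-]\) on \(S_\infty \) by induction:

$$\begin{aligned}{}[y]_{|S_\infty }(\varepsilon A^\infty )&= \beta _{\oplus }(y),&[y]_{|S_\infty }(\varepsilon )&= \beta _*(y) \\ [y]_{|S_\infty }(awA^\infty )&= [\tau _a(y)]_{|S_\infty }(wA^\infty ),&[y]_{|S_\infty }(aw)&= [\tau _a(y)]_{|S_\infty }(w) \end{aligned}$$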
We must prove that it can be extended to a measure, using Theorem 3.5. First, note that \([y]_{|S_\infty } (\varepsilon A^\infty ) = \beta _{\oplus }(y) = \beta _*(y) + \sum _{a\in A} (\beta _{\oplus } \circ \tau _a)(y) = [y]_{|S_\infty } (\varepsilon ) + \sum _{a\in A} [y]_{|S_\infty } (aA^\infty )\). If it is known that for all \(y\in Y\), \([y]_{|S_\infty } (wA^\infty ) = [y]_{|S_\infty } (w) + \sum _{a\in A} [y]_{|S_\infty } (waA^\infty )\) then for any \(b\in A\) we obtain the equation \([y]_{|S_\infty } (bwA^\infty ) = [\tau _b(y)]_{|S_\infty } (wA^\infty ) = [\tau _b(y)]_{|S_\infty } (w) + \sum _{a \in A} [\tau _b(y)]_{|S_\infty } (waA^\infty ) = [y]_{|S_\infty } (bw) + \sum _{a\in A} [y]_{|S_\infty } (bwaA^\infty )\). This proves condition (ii) of Theorem 3.5. We denote by \([-]\) the extension of \([-]_{|S_\infty }\). We postpone the proof of the measurability of \([-]\); what is left is the commutation of the coalgebra diagram. The first line of the definition of \([-]_{|S_\infty }\) gives directly that \(\beta _{\oplus } = [-](A^\infty )\) and \(\beta _* = [-](\varepsilon )\). Let \(a\in A\). For any \(y \in Y\), according to the second line of the definition of \([-]_{|S_\infty }\), the measures \([\tau _a(y)]\) and \([y]_a\) coincide on \(S_\infty \), hence are equal according to Theorem 3.5, so \([-] \circ \tau _a = [-]_a\). This completes the proof that the diagram commutes.

Measurability. It is not immediate to see why \([-] :Y \rightarrow \mathbb {S}A^\infty \) is a measurable function. What has to be shown is that for any \(S \in {\varSigma }_{A^\infty }\), \([-](S)\) is measurable. This is true when \(S \in S_\infty \). Indeed, \([-](\emptyset )\) is the zero function, which is measurable. For the rest we proceed by induction. Obviously \([-](\varepsilon A^\infty ) = \beta _{\oplus }\) and \([-](\varepsilon ) = \beta _*\) are measurable because \(\beta \) is. Furthermore, \([-](awA^\infty ) = [-]_a(wA^\infty ) = [-](wA^\infty ) \circ \tau _a\) and \([-](aw) = [-]_a(w) = [-](w) \circ \tau _a\) are measurable by induction hypothesis and composition.

We need a well-known theorem of measure theory, namely the \(\pi \)-\(\lambda \) theorem (see [1], Lemma 4.11). Let Z be a set. A set \(P \subseteq \mathcal {P}(Z)\) is a \(\pi \)-system if it is non-empty and closed under finite intersections. A set \(D \subseteq \mathcal {P}(Z)\) is a \(\lambda \)-system if it contains Z and is closed under difference (if \(A,B \in D\) and \(A\subseteq B\) then \(B\setminus A \in D\)) and countable increasing union. The \(\pi \)-\(\lambda \) theorem states that given P a \(\pi \)-system, D a \(\lambda \)-system such that \(P \subseteq D\), then \(\sigma _Z(P) \subseteq D\).

Take \(Z = A^\infty \), \(P = S_\infty \) and \(D = \{ S \in {\varSigma }_{A^\infty } \mid [-](S) \text { is measurable} \}\). It is easy to see that \(S_\infty \) is a \(\pi \)-system. Moreover, D is a \(\lambda \)-system. Indeed, \(A^\infty \in D\) (see above). If \(S_0 \subseteq S_1\) are sets in D, then \([-](S_1 \setminus S_0) = [-](S_1) - [-](S_0)\) is measurable as a difference of measurable functions. If \((S_n)_{n\in \mathbb {N}}\) is an increasing sequence of sets in D, then \([-] \left( \bigcup _{n\in \mathbb {N}} S_n\right) = \lim _{n\rightarrow \infty } [-] (S_n)\) is measurable as a pointwise limit of measurable functions. Finally, given the preceding paragraph, we have \(S_\infty \subseteq D\). The \(\pi \)-\(\lambda \) theorem therefore yields \({\varSigma }_{A^\infty } \subseteq D\). Thus \([-]\) is measurable. \(\square \)
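The values of \([y]\) on \(S_\infty \) used in this proof are directly computable: reading a word w moves y along the \(\tau _a\), after which \(\beta _{\oplus }\) gives the mass of the cylinder \(wA^\infty \) and \(\beta _*\) the mass of \(\{w\}\). The sketch below illustrates this on a hypothetical one-label coalgebra on rationals, \(\beta (q) = \langle q, q/2, a \mapsto q/2 \rangle \), which satisfies condition (ii); both this coalgebra and the function name `on_semiring` are assumptions made for illustration.

```python
from fractions import Fraction as Fr


def on_semiring(beta, y, word, cylinder):
    """Return [y](word A^infty) if cylinder is true, else [y]({word})."""
    for a in word:
        # Inductive clause: [y](a w ...) = [tau_a(y)](w ...).
        y = beta(y)[2][a]
    plus, star, _ = beta(y)
    # Base clauses: [y](epsilon A^infty) = beta_plus(y), [y](epsilon) = beta_star(y).
    return plus if cylinder else star


def beta(q):
    """Hypothetical coalgebra with q = q/2 + q/2, so condition (ii) holds."""
    return q, q / 2, {"a": q / 2}


point_mass = on_semiring(beta, Fr(1), "aaa", cylinder=False)
cylinder_mass = on_semiring(beta, Fr(1), "aaa", cylinder=True)
```

For this coalgebra the word aaa has mass 1/16 and its cylinder has mass 1/8, which is the sum of 1/16 and the mass 1/16 of the cylinder of aaaa, as required by condition (ii) of Theorem 3.5.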

An interpretation of the last theorem is that, in the subcategory of all F-coalgebras that satisfy equation (ii), the final object is \({\varPi }\). If the equivalent conditions of Theorem 6.9 hold, then \(\varphi _{{\varPi }} \circ [-]\) is a coalgebra morphism from \(\beta \) into the final coalgebra \(\omega \). Hence by finality \(\varphi _{{\varPi }} \circ [-] = \varphi _{\beta }\). Along with Lemma 6.8, this yields the following proposition, which is exactly the same as in Sect. 2.

Proposition 6.10

Let \(\beta :Y \rightarrow FY\) be an F-coalgebra satisfying the equivalent conditions of Theorem 6.9. Then for any \(y,z \in Y\), \([y] = [z]\) iff \(\varphi _\beta (y) = \varphi _\beta (z)\).

Back to \(\alpha :X \rightarrow \mathbb {D}\mathfrak {L}X\), we check that condition (ii) of Theorem 6.9 holds for \(\alpha ^\sharp = \langle \alpha ^\sharp _{\oplus },\alpha ^\sharp _*, a \mapsto \tau _a \rangle \). Note that because \(\alpha (-)(\mathfrak {L}X) = 1\), we have for all \(m \in \mathbb {S}X\) that \(m(X) = \int _X 1dm = \int _X \alpha (-)(\mathfrak {L}X) dm = \alpha ^\sharp _{\oplus }(m)\). This justifies the last equality:

$$\begin{aligned} \alpha ^\sharp _{\oplus }(m)&= \int _X \alpha (-)(\mathfrak {L}X) dm = \int _X \left( \alpha (-)(1) + \sum _{a\in A} \alpha (-)(\{a\} \times X) \right) dm \\&= \int _X \alpha (-)(1) dm + \sum _{a\in A} \int _X \alpha (-)(\{a\} \times X) dm \\&= \alpha ^\sharp _*(m) + \sum _{a\in A} \tau _a(m)(X) = \alpha ^\sharp _*(m) + \sum _{a\in A} (\alpha _{\oplus }^\sharp \circ \tau _a)(m) \end{aligned}$$
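As a sanity check in the finite case, consider the matrix representation of Example 5.9, which has a single label a. There, condition (ii) reads \(L_{\oplus } = L_* + L_{\oplus } M_a\) as an identity of row vectors. Assuming, as for any PTS, that each \(\alpha (x)\) sums to 1, the column sums of \(M_a\) are the complements of the entries of \(L_*\), which gives:

$$\begin{aligned} L_{\oplus } M_a&= \begin{pmatrix} 1&1&1&1 \end{pmatrix} - \begin{pmatrix} 1/3&2/3&1/3&0 \end{pmatrix} = \begin{pmatrix} 2/3&1/3&2/3&1 \end{pmatrix} \\ L_* + L_{\oplus } M_a&= \begin{pmatrix} 1/3&2/3&1/3&0 \end{pmatrix} + \begin{pmatrix} 2/3&1/3&2/3&1 \end{pmatrix} = \begin{pmatrix} 1&1&1&1 \end{pmatrix} = L_{\oplus } \end{aligned}$$

The first entry agrees with the table of Example 5.9, where \(L_{\oplus } M_a e_x = 2/3\).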

Conclusion. Any \(\alpha :X \rightarrow \mathbb {D}\mathfrak {L}X\) can be given a canonical trace semantics via a determinisation process. This is a function \([-] :\mathbb {S}X \rightarrow \mathbb {S}A^\infty \).

6.2 Correctness of the Resulting Trace Semantics

In [13], given a PTS \(\alpha :X \rightarrow \mathbb {D}\mathfrak {L}X\), the trace semantics \(\llbracket - \rrbracket :X \rightarrow \mathbb {D}A^\infty \) (denoted by \(\mathbf tr \) in [13]) is defined by

We will hereby prove that this semantics fits with ours, in the sense that the following diagram commutes.


Lemma 6.11

For any \(m \in \mathbb {S}X\) and any \(S\in S_\infty \), \([m](S) = \int _X ([-] \circ \eta _X)(S) dm\).


Proof. In this proof, \(\int _X f dm\) may be denoted by \(\int _{x\in X} f(x) m(dx)\). One can show using the monotone convergence theorem that for any measurable function \(f :X \rightarrow [0,1]\),

$$\begin{aligned} \int _X f d\tau _a (m) = \int _{x \in X} \left( \int _X f d t_a (x) \right) m(dx) \end{aligned}$$

Note further that \([\eta _X(x)] (\varepsilon A^\infty ) = (\alpha ^\sharp _{\oplus } \circ \eta _X)(x) = \tilde{\alpha }_{\oplus }(x) = \alpha (x)(\mathfrak {L}X)\) and in the same way \([\eta _X(x)] (\varepsilon ) = \alpha (x)(1)\). Now let us prove the lemma by induction, for all \(m\in \mathbb {S}X\). First

$$\begin{aligned}&[m](\varepsilon A^\infty ) = \alpha ^\sharp _{\oplus }(m) = \int _X \alpha (-)(\mathfrak {L}X) dm = \int _X ([-] \circ \eta _X) (\varepsilon A^\infty ) dm \\&[m](\varepsilon ) = \alpha ^\sharp _*(m) = \int _X \alpha (-)(1) dm = \int _X ([-] \circ \eta _X) (\varepsilon ) dm \end{aligned}$$

Assume the result is true for \(wA^\infty \) and w. Take \(\diamond \in \{\{\varepsilon \},A^\infty \}\).

$$\begin{aligned}{}[m] (aw\diamond )&= [\tau _a(m)](w\diamond ) = \int _X ([-] \circ \eta _X) (w\diamond ) d\tau _a(m)&\text {(induction hypothesis)} \\&= \int _{x\in X} \left( \int _X ([-] \circ \eta _X) (w\diamond ) dt_a(x) \right) m(dx)&\text {(preliminary remark)} \\&= \int _{x \in X} [\tau _a(\eta _X(x))](w\diamond ) m(dx)&\text {(definition of }\tau _a) \\&= \int _X ([-] \circ \eta _X) (aw\diamond ) dm \end{aligned}$$

\(\square \)

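Using this last lemma and that \(\tau _a \circ \eta _X = t_a\), we have for any \(x\in X\) and \(\diamond \in \{\{\varepsilon \},A^\infty \}\):

$$\begin{aligned}{}[\eta _X(x)](\varepsilon A^\infty )&= \alpha (x)(\mathfrak {L}X) = 1, \qquad [\eta _X(x)](\varepsilon ) = \alpha (x)(1) \\ [\eta _X(x)](aw\diamond )&= [\tau _a(\eta _X(x))](w\diamond ) = [t_a(x)](w\diamond ) = \int _X ([-] \circ \eta _X)(w\diamond ) dt_a(x) \end{aligned}$$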
Thus, for any \(x\in X\), \(([-] \circ \eta _X)(x)\) and \((\iota _{A^\infty } \circ \llbracket - \rrbracket )(x)\) are measures in \(\mathbb {S}A^\infty \) that coincide on \(S_\infty \). Because of Theorem 3.5, they are equal. Consequently, the trace semantics we get via determinisation and Eilenberg-Moore algebras is the same as the Kleisli trace semantics of [13].

Theorem 6.12

The diagram (4) commutes, i.e., the maps \(\iota _{A^\infty } \circ \llbracket - \rrbracket \) and \([-] \circ \eta _X\) coincide.

Finally, note that if \(\alpha :X \rightarrow \mathbb {D}\mathfrak {L}X\) can be seen as a discrete system, i.e., for all \(x \in X\), \(\alpha (x)\) is a countable convex combination of Dirac distributions, then the general semantics coincides with the one obtained in Sect. 2.

7 Related Work

The (in)finite trace semantics of PTS discussed in this paper was presented coalgebraically in [13], through the Kleisli category of the (sub-)Giry monad. By using a determinisation construction, we obtain the same trace semantics, in a fundamentally different way. This determinisation construction is precisely what allows us to use bisimulations (up-to) to prove equivalence. Further, our determinisation construction can be presented separately for the discrete and continuous cases (the discrete case is much more basic), whereas in the Kleisli setting only the general continuous case can be presented (since discrete systems generate a probability measure). Other coalgebraic approaches to infinite traces (based on fixed points, e.g., [7, 19]) do not use determinisation.

Our determinisation construction for (in)finite traces is strongly inspired by the one for finite traces in [12, 17]. As explained in Sect. 4, the main technical difference is that the total probability mass of states in the determinised Moore automaton becomes observable, and that this yields a probability measure over sets of traces rather than a (sub)probability distribution over individual traces.

The above-mentioned equivalence between the determinisation and ‘Kleisli’ trace semantics for finite traces is a motivating example for the general comparison between coalgebraic determinisation and Kleisli traces in [12]. However, we do not know if those results can be applied here for at least one reason: the correspondence stated in [12] uses only one monad for both constructions, using, in case of finite traces, an extension natural transformation of the form \(\mathfrak {e}:\mathbb {S}\mathfrak {L}\Rightarrow F \mathbb {S}\) (actually, the discrete version). However, in our construction, we have to move from probability measures in the definition of PTSs (modelled by \(\mathbb {D}\)) to sub-probability measures in the determinised Moore automaton (modelled by \(\mathbb {S}\)). In contrast to the case of finite traces, we cannot simply replace \(\mathbb {D}\) by \(\mathbb {S}\) in the definition of PTS, since the sums-to-1 condition is required for the condition (ii) of Theorem 6.9. One might try to nevertheless use only \(\mathbb {S}\) as the monad, focusing on PTSs (involving \(\mathbb {S}\)) that satisfy the sums-to-one condition. But it is currently unclear to us how such a subclass fits into the framework of [12]; moreover, the Kleisli semantics for PTSs based on \(\mathbb {S}\) is finite traces [13, Theorem 3.33]. Another idea is to use the isomorphism \(\mathbb {D}(A\times X +1)\simeq \mathbb {S}(A \times X)\) (via the map \(m \mapsto m_{|{{\varSigma }_A \otimes {\varSigma }_X}}\)), but this does not seem to solve the issue: the Kleisli semantics of PTS of the form \(X \rightarrow \mathbb {S}(A\times X)\) is trivial [13, Theorem 3.33]. We leave a suitable extension of the abstract framework [12] for future work.

For the algorithm presented in Sect.  5, we embed convex combinations (in the transition structure of PTS) into vector spaces, in order to use a more general contextual closure, w.r.t. arbitrary linear combinations rather than only convex combinations. This guarantees termination of the algorithm based on bisimulation up to congruence. We do not know whether this move is really necessary: perhaps the contextual closure w.r.t. only convex combinations suffices. The recent [5] might be of use in answering this question.

This work was done primarily from a coalgebraic point of view. Actually, as pointed out by one of the reviewers, the determinisation of a PTS involves a standard construction in the theory of Markov chains and stochastic processes: the passage from a kernel to a stochastic operator. This perspective could be investigated further. Notably, one motivation for doing so is to study how the results of Sect. 5 could extend to (discrete approximations of) the measurable PTSs of Sect. 6.