1 Introduction

In the setting of concurrent and distributed systems, choreographic languages are used to define interaction protocols that communicating processes should abide by [32, 43, 46]. These languages are akin to the “Alice and Bob” notation found in security protocols, and inherit the key idea of making data communication manifest in programs [42]. This is usually obtained through a linguistic primitive like \(\textsf{p}.e \mathbin {\varvec{\rightarrow }}\textsf{q}.x\), read “\(\textsf{p}\) communicates the result of evaluating expression \(e\) to \(\textsf{q}\), which stores it in its local variable \(x\)”.

In recent years, the communities of concurrency theory and programming languages have been prolific in developing methodologies based on choreographies, yielding results in program verification, monitoring, and program synthesis [2, 31]. For example, in multiparty session types, types are choreographies used for checking statically that a system of processes implements protocols correctly [30]. Further, in choreographic programming, choreographic languages are elevated to full-fledged programming languages [40], which can express how data should be pre- and post-processed by processes (encryption, validation, anonymisation, etc.).

Choreographic programming languages come with a procedure known as Endpoint Projection (EPP), which automatically synthesises executable code for each process described in a choreography, with the guarantee that executing these processes together implements the communications prescribed in the choreography [7, 8]. These languages showed promise in a number of contexts, including parallel algorithms [11], cyber-physical systems [28, 37, 38], self-adaptive systems [23], system integration [27], information flow [36], and the implementation of security protocols [28].

EPP involves three elements: the source choreographic language, the target process language, and the compiler. The interplay between these components, where a single instruction at the choreographic level might be implemented by multiple instructions in the target language, makes the theory of choreographic programming error-prone: even for simpler approaches, like abstract choreographies without computation, it has recently been discovered that a few proofs published in peer-reviewed articles do not hold and that their theories required adjustments. While in most cases these adjustments amounted to correcting small details in proofs or dealing with missing cases, there were situations that required finding new proof strategies or reformulating statements [39, 45]. In exceptional situations, it has been discovered that key results actually did not hold [3, 4, 5, 25, 35].

This article presents a formalisation of a core theory of choreographic programming in the theorem prover Coq, the process of developing this formalisation, the challenges encountered, and how tackling these challenges led to improvements of the original theory.

A note on the process. We argue that computer-aided verification can be successfully applied to the study of choreographies and can provide solid foundations for future developments. To substantiate this claim, we summarise the story behind this article, which illustrates how interactive theorem proving can do more than just check what we already know.

Our starting point was the theory of Core Choreographies (CC), a minimalistic language that the first two authors previously proposed for the study of choreographic programming [16]. CC includes only the essential features of choreographic languages and minimal computational capabilities at processes (computing the successor of a natural number and deciding equality of two natural numbers), yet it is expressive enough to be Turing complete.

We started formalising CC in Coq in late 2018. In mid-2019, we gave an informal progress report on the promising status of the formalisation at the TYPES conference [19]. Unfortunately, we soon stumbled upon an unexpected source of complexity for the formalisation: a set of term-rewriting rules for a precongruence relation used in the semantics of the language for (i) expanding procedure calls and (ii) reshuffling independent communications to model concurrent execution. In addition to being time consuming, reasoning with precongruence systematically made the formalisation significantly more complicated than the development in [16] (for a more technical discussion, see Sect. 3.5).

At the time, the second author was responsible for a Master course on theory of choreography for students in Computer Science. It quickly became apparent that the technical aspects (including, but not only, structural precongruence) that complicated the formalisation of CC were also the most challenging for the students. This observation led that author to develop an alternative theory of CC for his course material that dispenses with these problematic notions without changing its essence [41]. The formalisation in this article uses this revised choreography theory.

Thus, our work also shows that theorem proving can be used in research: the insights obtained while doing this formalisation led to changes in the original theory. We show that this did not come at the cost of expressive power: the original proof of Turing completeness from [16] still works for the theory in [41] without essential changes [21]. Furthermore, formalising the theory also allowed us to identify unnecessary assumptions in some lemmas, yielding stronger results.

Publication history and contribution. As mentioned previously, a first informal progress report on this formalisation was presented at the TYPES conference in 2019 [19], following an approach that later turned out to be infeasible. The first formalisation of the choreographic language, including the proof of Turing completeness, was presented in [21], while the formalisation of EPP appeared originally in [20]. The current presentation discusses an updated formalisation, which (i) no longer uses Coq’s module system and (ii) treats partial functions in a substantially different way, which considerably simplifies the definition of EPP. We do not discuss the formalisation of the proof of Turing completeness, as this is essentially unchanged from [21]. Instead, we place a stronger emphasis on the formalisation challenges compared to the works cited.

The direct result of our work is a formalisation that can be used as a basis for future work on choreographic programming, both in theory and in practice. Subsequent developments already include a formalisation of choreography repair [17], a more flexible notion of projection allowing for livelocks [15], and a toolchain for generating executable code from choreographies [14]. These developments capitalise on the current contribution in different ways, showing that the current formalisation is reusable, extensible, and amenable to incorporation into tools for software development.

Furthermore, our formalisation dispels any concerns that there may be regarding the correctness of our results—which is especially relevant in an area where many proofs are extremely technical and tedious both to write down and to check in detail.

Lastly, we provide more evidence to substantiate the general claim that interactive theorem proving is a valid tool for conducting research in theoretical computer science, by showing that formalising a state-of-the-art theoretical development is feasible and can provide valuable insights that help improve the theoretical development.

The big picture. This work is the first step towards a more ambitious goal: the development of a certified framework for choreographic programming. At a later stage, we plan on developing compilers that can translate the process implementations generated by EPP into executable code in different programming languages (see Fig. 1). This would yield end-to-end compilation from choreographies to actual executable code.

Fig. 1: Two-stage compilation process from choreographies to executable code

Our goal motivated two important design choices in the current work that are not present in [20, 21]. First, we want to extract a correct implementation of EPP from our formalisation. This motivated us to move away from Coq’s module system, as we found the Haskell code generated by extraction to be unidiomatic. Second, we introduce the possibility of annotating terms in choreographies with data that may be needed for (second-stage) compilation to executable programming languages.

Our language has two characteristics that are inherited from the choreographic model in [16]. First, the semantics that we present in Sect. 3.3 is synchronous. This is a standard choice for choreographic models, as it makes the development simpler; the interested reader is referred to [12] for a lengthier discussion on asynchronous models. The second choice is that we assume a fixed set of labels with only two elements (see Sect. 3.2). This is again a standard choice in theoretical developments, as any label from a fixed finite set of labels can be encoded by a sequence of labels from a two-element set, but it leads to inefficiency in practical applications. Formalising a more general theory in Coq poses complex technical challenges, as discussed in Sect. 5.5.
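The encoding of larger label sets mentioned above can be sketched as follows. This is only an illustration in Python, not part of the formalisation: the names `code_width`, `encode_label` and `decode_label`, the label names `"L"` and `"R"`, and the fixed-width binary scheme are all assumptions made for the example.

```python
def code_width(n_labels):
    """Number of two-label selections needed to distinguish n_labels labels."""
    return max(1, (n_labels - 1).bit_length())


def encode_label(index, n_labels):
    """Encode the label with the given index as a sequence of "L"/"R" selections."""
    bits = format(index, "0{}b".format(code_width(n_labels)))
    return ["L" if bit == "0" else "R" for bit in bits]


def decode_label(selections):
    """Recover the label index from a sequence of "L"/"R" selections."""
    return int("".join("0" if s == "L" else "1" for s in selections), 2)
```

Under this scheme, a single selection from a six-element label set becomes three selections over the two-element set, which illustrates the inefficiency noted in the text.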

Structure. A full understanding of the more technical details of our formalisation benefits from some background knowledge on choreographies. For convenience, Sect. 2 features a short introduction to the main intuitions and results of choreography theory, which can be skipped by readers familiar with the topic. Our choreographic language (syntax and semantics) is presented together with its Coq formalisation in Sect. 3, where it is also shown that it enjoys the usual properties of choreographic languages. Section 4 defines the target process language, together with its semantics. EPP is formalised in Sect. 5, and its soundness and completeness are discussed in Sect. 6. We review related formalisation efforts in Sect. 7, before concluding in Sect. 8.

Our development was made using Coq 8.13.2. The source code is available at [22].

2 Background: Choreographic Languages and Endpoint Projection

In this section we describe the language of Simple Choreographies [41], which introduces the basic principles of choreographies and EPP. We include this material to make our development accessible to the reader not familiar with the topic, but it is not directly used in our development.

2.1 Simple Choreographies

Simple Choreographies can express finite sequences of communications between processes. Processes are identified by names ( \(\textsf{p}\), \(\textsf{q}\), etc.). Choreographies, ranged over by C, are constructed according to the following grammar.

$$\begin{aligned} C&:=\textsf{p} \mathbin {\varvec{\rightarrow }}\textsf{q};C \mid \varvec{0} \end{aligned}$$

A choreography \(\textsf{p} \mathbin {\varvec{\rightarrow }}\textsf{q};C\) represents a communication from a process \(\textsf{p}\) to a process \(\textsf{q}\) with continuation C; \(\varvec{0} \) is the terminated choreography. We omit trailing \(\varvec{0} \)s in examples.

Example 1

(Ring protocol [41]) The choreography below describes a ring protocol among three participants: \(\textsf{Alice}\) communicates to \(\textsf{Bob}\); then \(\textsf{Bob}\) communicates to \(\textsf{Carol}\); and finally \(\textsf{Carol}\) communicates back to \(\textsf{Alice}\).

$$\begin{aligned} \textsf{Alice} \mathbin {\varvec{\rightarrow }}\textsf{Bob}; \textsf{Bob} \mathbin {\varvec{\rightarrow }}\textsf{Carol}; \textsf{Carol} \mathbin {\varvec{\rightarrow }}\textsf{Alice} \end{aligned}$$
(1)

The semantics of Simple Choreographies is given as the labelled transition system induced by the rules displayed in Fig. 2. Transition labels have the form \(\textsf{p} \mathbin {\varvec{\rightarrow }}\textsf{q}\), allowing for observing the communications performed by a choreography.

Fig. 2: Semantics of simple choreographies

Rule \(\textsc {{com}}\) models the execution of a communication at the beginning of a choreography. Rule \(\textsc {{delay}}\), instead, allows for performing a transition within the continuation of a choreography, provided that the transition does not involve any of the processes in preceding instructions. This rule captures the fact that processes run independently of each other, and thus choreographic instructions can be executed out of order. The independence requirement is captured by the side-condition \(\{\textsf{p}, \textsf{q}\} \mathbin {\#}\{\textsf{r}, \textsf{s}\}\), where \(\mathbin {\#}\) relates disjoint sets.

Example 2

(Ring protocol, continued [41]) Let C be the choreography in (1). Then, by rule \(\textsc {{com}}\), we have the following chain of transitions.

$$\begin{aligned} C \xrightarrow {\textsf{Alice} \mathbin {\varvec{\rightarrow }}\textsf{Bob}} \textsf{Bob} \mathbin {\varvec{\rightarrow }}\textsf{Carol}; \textsf{Carol} \mathbin {\varvec{\rightarrow }}\textsf{Alice} \xrightarrow {\textsf{Bob} \mathbin {\varvec{\rightarrow }}\textsf{Carol}} \textsf{Carol} \mathbin {\varvec{\rightarrow }}\textsf{Alice} \xrightarrow {\textsf{Carol} \mathbin {\varvec{\rightarrow }}\textsf{Alice}} \varvec{0} \end{aligned}$$

These communications cannot be executed out-of-order, because of the chain of causality between them: each instruction involves a process that needs to participate in a previous instruction.

Example 3

Consider now the choreography below (inspired by the factory examples in [41]), which models a system where two “ordering” processes \(\textsf{o}_1\) and \(\textsf{o}_2\) independently communicate two respective orders to the servers \(\textsf{s}_1\) and \(\textsf{s}_2\).

$$\begin{aligned} \textsf{o}_1 \mathbin {\varvec{\rightarrow }}\textsf{s}_1; \textsf{o}_2 \mathbin {\varvec{\rightarrow }}\textsf{s}_2 \end{aligned}$$
(2)

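The following derivation shows that \(\textsf{o}_2 \mathbin {\varvec{\rightarrow }}\textsf{s}_2\) can be executed first.

$$\begin{aligned} \dfrac{\dfrac{}{\textsf{o}_2 \mathbin {\varvec{\rightarrow }}\textsf{s}_2 \xrightarrow {\textsf{o}_2 \mathbin {\varvec{\rightarrow }}\textsf{s}_2} \varvec{0}}\;\textsc {{com}} \qquad \{\textsf{o}_1, \textsf{s}_1\} \mathbin {\#}\{\textsf{o}_2, \textsf{s}_2\}}{\textsf{o}_1 \mathbin {\varvec{\rightarrow }}\textsf{s}_1; \textsf{o}_2 \mathbin {\varvec{\rightarrow }}\textsf{s}_2 \xrightarrow {\textsf{o}_2 \mathbin {\varvec{\rightarrow }}\textsf{s}_2} \textsf{o}_1 \mathbin {\varvec{\rightarrow }}\textsf{s}_1}\;\textsc {{delay}} \end{aligned}$$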
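The semantics of Simple Choreographies can also be read operationally. The sketch below, in Python, is an illustration under stated assumptions rather than the formalisation: a choreography is represented as a tuple of (sender, receiver) pairs with the empty tuple as the terminated choreography, and the function name `transitions` is hypothetical. It enumerates every transition obtained by executing a communication after delaying past the instructions that precede it, which is allowed only when those instructions involve disjoint processes.

```python
def transitions(chor):
    """Return all (label, residual) pairs that the choreography can perform.

    The instruction at position i can fire iff its two processes do not occur
    in any earlier instruction (the disjointness side-condition).
    """
    result = []
    blocked = set()  # processes occurring in the instructions skipped so far
    for i, (p, q) in enumerate(chor):
        if p not in blocked and q not in blocked:
            # Execute instruction i and keep all other instructions in place.
            result.append(((p, q), chor[:i] + chor[i + 1:]))
        blocked |= {p, q}
    return result


ring = (("Alice", "Bob"), ("Bob", "Carol"), ("Carol", "Alice"))
factory = (("o1", "s1"), ("o2", "s2"))
```

On `ring` only the first communication is enabled, whereas on `factory` both communications are, matching Examples 2 and 3.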

2.2 Simple Processes

Implementations of Simple Choreographies are modelled in a process language called Simple Processes [41]. First, we define a grammar for writing process behaviours.

$$\begin{aligned} P,Q,R&:={\textsf{p}}!;P \mid {\textsf{p}}?;P \mid \varvec{0} \end{aligned}$$

These actions are the local counterparts to the communication action in choreographies. A send action \({\textsf{p}}!\) sends a message to a process \(\textsf{p}\), and the dual receive action \({\textsf{p}}?\) receives a message from a process \(\textsf{p}\). The term \(\varvec{0} \) is the terminated process.

Processes are composed into networks (N, M, etc.), which are maps from process names to processes. We introduce some notation: \(\varvec{0} \) is the terminated network, where all process names are mapped to \(\varvec{0} \); \(\textsf{p} [P]\) is the network where \(\textsf{p}\) is mapped to P and all other process names are mapped to \(\varvec{0} \); and \(N \mathbin {\varvec{|}}M\) (“N parallel M”) is the union of N and M, assuming that their supports (Footnote 1) are disjoint. Under extensional equality of functions, the set of networks equipped with parallel composition forms a partial commutative monoid with \(\varvec{0} \) as identity element: \(N \mathbin {\varvec{|}}\varvec{0} = N\), \(N \mathbin {\varvec{|}}M = M \mathbin {\varvec{|}}N\), and \(N_1 \mathbin {\varvec{|}}(N_2 \mathbin {\varvec{|}}N_3) = (N_1 \mathbin {\varvec{|}}N_2) \mathbin {\varvec{|}}N_3\) [41].

Example 4

The following network implements the choreography in (1).

$$\begin{aligned} \textsf{Alice} [ {\textsf{Bob}}!; {\textsf{Carol}}? ] \mathbin {\varvec{|}}\textsf{Bob} [ {\textsf{Alice}}?; {\textsf{Carol}}! ] \mathbin {\varvec{|}}\textsf{Carol} [ {\textsf{Bob}}?; {\textsf{Alice}}! ] \end{aligned}$$
(3)

Fig. 3: Semantics of simple processes

The semantics of Simple Processes is given by the transition rules in Fig. 3. Rule \(\textsc {{com}}\) synchronises processes with matching send and receive actions. A further rule allows for parallel execution.

Example 5

The transitions of the choreography in (1) coincide with those of the network in (3). Technically, the labelled transition systems generated by the choreography and the network are isomorphic, showing that the network is indeed a precise implementation of the choreography.

Example 6

Out-of-order execution for choreographies corresponds to parallelism at the level of networks. The following network implements the choreography in (2).

$$\begin{aligned} \textsf{o}_1 [{\textsf{s}_1}!] \mathbin {\varvec{|}}\textsf{o}_2 [{\textsf{s}_2}!] \mathbin {\varvec{|}}\textsf{s}_1 [{\textsf{o}_1}?] \mathbin {\varvec{|}}\textsf{s}_2 [{\textsf{o}_2}?] \end{aligned}$$
(4)

Using the rule for parallel execution and the monoidal structure of parallel composition, the network can start by executing either the communication between \(\textsf{o}_1\) and \(\textsf{s}_1\) or the one between \(\textsf{o}_2\) and \(\textsf{s}_2\).
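The behaviour of networks described in this subsection can be sketched in Python as follows. The representation is an assumption made for illustration: a network is a dictionary from process names to tuples of actions, an action is a pair `("!", p)` or `("?", p)`, and names absent from the dictionary are taken to be mapped to the terminated process. The function name `net_transitions` is hypothetical.

```python
def net_transitions(net):
    """Return all (label, residual network) pairs that the network can perform.

    A transition synchronises a send at the head of one process with the
    matching receive at the head of another; all other processes are untouched.
    """
    result = []
    for p, behaviour in net.items():
        if behaviour and behaviour[0][0] == "!":
            q = behaviour[0][1]
            q_behaviour = net.get(q, ())
            if q_behaviour and q_behaviour[0] == ("?", p):
                residual = dict(net)
                residual[p] = behaviour[1:]
                residual[q] = q_behaviour[1:]
                # Terminated processes are dropped: they are implicitly mapped to 0.
                residual = {r: b for r, b in residual.items() if b}
                result.append(((p, q), residual))
    return result


network4 = {
    "o1": (("!", "s1"),),
    "o2": (("!", "s2"),),
    "s1": (("?", "o1"),),
    "s2": (("?", "o2"),),
}
```

Applied to `network4`, which mirrors the network in (4), the function reports both communications as enabled.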

2.3 Endpoint Projection

In general, writing correct implementations of protocols is hard, especially for more expressive choreographic languages such as the one that we use later in this article. Endpoint projection (EPP) is a mechanical procedure for translating choreographies into networks by splitting choreographic terms into their local counterparts [7, 8, 16, 30, 41]. The idea is that given a choreography C and a process \(\textsf{p}\), we first compute the process term that implements the actions that \(\textsf{p}\) should perform to implement its part in C. Then, EPP is defined as the parallel composition of all such terms.

Fig. 4: Process projection for simple choreographies

In the case of Simple Choreographies and Simple Processes, the process projection map is defined in a natural way by the recursive equations in Fig. 4. In particular, a communication term \(\textsf{p} \mathbin {\varvec{\rightarrow }}\textsf{q}; C\) is projected to a send action and the projection of the continuation if we are projecting the sender (first case), a receive action and the projection of the continuation if we are projecting the receiver, or just the projection of the continuation if we are projecting a process that is not involved in the communication.

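Given a choreography C, its EPP is the network obtained by composing in parallel, for each process \(\textsf{p}\), the term \(\textsf{p} [P]\) where P is the process projection of C on \(\textsf{p}\). This network is a correct implementation of C.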

Theorem 1

(Correctness of EPP [41]) The following statements hold for every choreography C and transition label \(\mu \) in the language of Simple Choreographies.

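  • Completeness. For any \(C'\), if \(C \xrightarrow {\mu }C'\) then the EPP of C has a transition labelled \(\mu \) to the EPP of \(C'\).

  • Soundness. For any N, if the EPP of C has a transition labelled \(\mu \) to N, then \(C \xrightarrow {\mu }C'\) for some \(C'\) such that N is the EPP of \(C'\).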

Example 7

The networks in (3) and (4) are, respectively, the EPPs of the choreographies in (1) and (2).
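The three cases of process projection and the definition of EPP can be sketched in Python as follows. This is a minimal illustration under assumptions, not the formalised definition: choreographies are tuples of (sender, receiver) pairs with distinct sender and receiver, networks are dictionaries from process names to tuples of actions, and the names `project` and `epp` are hypothetical.

```python
def project(chor, r):
    """Project a choreography on process r."""
    behaviour = []
    for p, q in chor:
        if r == p:
            behaviour.append(("!", q))  # r is the sender
        elif r == q:
            behaviour.append(("?", p))  # r is the receiver
        # Otherwise r is not involved and only the continuation is projected.
    return tuple(behaviour)


def epp(chor):
    """Compose the projections of all processes occurring in the choreography."""
    processes = {name for pair in chor for name in pair}
    return {r: project(chor, r) for r in sorted(processes)}


ring = (("Alice", "Bob"), ("Bob", "Carol"), ("Carol", "Alice"))
factory = (("o1", "s1"), ("o2", "s2"))
```

Processes that do not occur in the choreography are left out of the dictionary, which corresponds to mapping them to the terminated process.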

The completeness part of Theorem 1 is proven by case analysis on the transition performed by the choreography. This gives information about the choreography’s structure that can be used to infer the shape of the network, which in turn shows that it can perform the corresponding transition. The soundness part is by induction on the choreography, with two cases depending on whether the transition requires applying the delay rule (and invoking the induction hypothesis).

2.4 Taking Stock

Theorem 1 is used to prove other notable results given by the choreographic approach, such as deadlock-freedom. A deadlocked network is one that is not terminated but cannot make any transitions, typically because all processes are waiting for someone else. Even in a simplistic process language such as Simple Processes, we can write deadlocked networks, such as:

$$\begin{aligned} \textsf{p} [{\textsf{q}}?] \mathbin {\varvec{|}}\textsf{q} [{\textsf{p}}?]. \end{aligned}$$

Here, \(\textsf{p}\) and \(\textsf{q}\) are both waiting for each other, and therefore the network will never be able to proceed.
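The notion of deadlock used here can be checked mechanically on small networks. The following Python sketch is an illustration under assumptions: a network is a dictionary from process names to tuples of actions `("!", p)` or `("?", p)`, and the names `can_step` and `is_deadlocked` are hypothetical.

```python
def can_step(net):
    """True if some send at the head of a process meets the matching receive."""
    for p, behaviour in net.items():
        if behaviour and behaviour[0][0] == "!":
            q = behaviour[0][1]
            other = net.get(q, ())
            if other and other[0] == ("?", p):
                return True
    return False


def is_deadlocked(net):
    """A network is deadlocked if it is not terminated and cannot make a transition."""
    terminated = all(not behaviour for behaviour in net.values())
    return not terminated and not can_step(net)


stuck = {"p": (("?", "q"),), "q": (("?", "p"),)}
```

The network `stuck` mirrors the example above: both processes wait to receive, so no synchronisation is possible.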

Since communication terms in choreographies specify simultaneously what sender and receiver processes are involved, choreographies cannot describe deadlocks, a property known as deadlock-freedom by design [7]. As a consequence of Theorem 1, the networks generated by EPP can never become deadlocked.

The choreographic language that we consider in the rest of our article is more expressive than Simple Choreographies, as it includes features that are important for modelling realistic protocols. However, the general structure of the development follows the roadmap given in this section, albeit with a much higher level of complexity.

3 Core Choreographies

We introduce Core Choreographies (CC), the choreographic language that we work with, and its formalisation. At the end of this section, we discuss how the formalisation process guided the evolution of the language from its original presentation in [16] to its present form, which is closer to the style of [41].

In CC, processes can perform point-to-point communications and have storage. Communicated messages can be either values, which are computed by evaluating local expressions, or labels (tags, or constants) from a fixed set (Footnote 2). Additionally, choreographies can include conditionals based on Boolean expressions and invoke recursive procedures.

3.1 Preliminaries

Choreographies are parameterised by a signature, which defines the types for process names (processes for short), local variables (used to access the processes’ storage), values, expressions, Boolean expressions, and procedure names (from recursion variables). Signatures also include types for (user-defined) annotations (as discussed in Sect. 1). Since the types of expressions and values are parameters, signatures also need to specify the evaluation functions mapping expressions to values and Boolean expressions to Booleans. We fix a signature and, for convenience, introduce abbreviations for all of its parameters.

All datatypes except the evaluation functions are equipped with a decidable equality. Since we are targeting extraction, which is not compatible with modules, we reimplemented decidable types as a record type consisting of exactly these two components (a type and a decidable equality on it), and reproved the lemmas about decidable equality from the Coq standard library. We also show that the Cartesian product of two decidable types can be made into a decidable type, and we define a two-element decidable type whose elements are the two labels.

Evaluation functions are again records. The first element is a function that takes an expression and a mapping from a process’s variables to values, and returns a value (possibly of a different type from the one stored locally). The second element is a proof that the value returned by evaluation does not change if the mapping from variables to values is replaced by an extensionally equivalent one.

A dedicated type models the memory state of the set of all processes (Footnote 3). We define extensional equality on states and prove that it is an equivalence relation. Furthermore, we define an operation for updating a state by assigning a value to a variable of a given process, and prove a number of useful rewriting lemmas.
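The distinction between a state and its representation can be illustrated with a small Python sketch. Everything here is an assumption made for the example: states are functions from a process name and a variable name to a value, the names `update`, `ext_equal` and `initial` are hypothetical, and extensional equality is only tested on a finite list of probe points, whereas the formalisation states it for all processes and variables.

```python
def update(state, p, x, v):
    """State that agrees with `state` everywhere except at p's variable x, which holds v."""
    def updated(q, y):
        return v if (q, y) == (p, x) else state(q, y)
    return updated


def ext_equal(s1, s2, probes):
    """Check that two states agree on every (process, variable) pair in `probes`."""
    return all(s1(q, y) == s2(q, y) for q, y in probes)


def initial(q, y):
    """A state in which every variable of every process holds 0."""
    return 0


# Two different representations of the same state: the updates commute.
s1 = update(update(initial, "p", "x", 1), "q", "y", 2)
s2 = update(update(initial, "q", "y", 2), "p", "x", 1)
```

Here `s1` and `s2` are distinct function objects that nevertheless agree on every lookup, which is the kind of situation that extensional equality is meant to capture.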

3.2 Syntax

Choreographies are defined inductively by the following grammar (Footnote 4).

figure ad

Here, are processes, is an expression, is a variable, is a label, is an annotation, is a Boolean expression, is a procedure name, and is a list of processes.

The terms denoted \(\eta \) are called interactions; for many results, it is convenient that they form their own type. In a value communication, one process communicates the result of evaluating an expression to another process, which stores it in one of its local variables. In a label selection, one process communicates a label to another process.

Label selections are used in conjunction with conditionals. In a conditional, the evolution of the choreography is determined by the outcome of evaluating a Boolean expression at a given process. Other processes that need to know which branch was chosen (knowledge of choice [9]) can get this information by receiving one of the two labels from that process.

Interactions are paired with annotations, which are ignored by the semantics. They are meant to include additional information that may be needed in subsequent processing steps, such as documentation or the second-stage compilation mentioned in Sect. 1. We omit annotations in all our examples.

A procedure call invokes a procedure by its name. A procedure may involve several processes, and the semantics of CC allows each process to join the procedure only when needed. A runtime term represents this intermediate situation: it records that execution of the named procedure has already evolved to a given choreography, together with the list of processes that have not yet joined it. Runtime terms are not meant to be written by programmers: they are auxiliary terms generated by the semantics.

The grammar of choreographies is defined as the following inductive types.

figure bi

A set of procedure definitions is a mapping assigning to each procedure name a list of processes and a choreography; intuitively, the list contains the processes that are used in the procedure. A program is a pair containing a set of procedure definitions and the choreography to be executed at the start, also called the main choreography.

figure bm

We write and for, respectively, the set of procedure definitions and the main choreography in a program (so and are simply aliases for the corresponding projections). Likewise, and denote the list of processes and the definition of a particular procedure within . Finally, is the function mapping each variable to the set of processes that it uses according to .

Example 8

(Distributed Authentication) The choreography below describes a multiparty authentication scenario where an identity provider authenticates a client to server . (For convenience, we name some of the subterms in the choreography.)

figure cd

starts with communicating its to , which stores them in . Then, checks whether the received credentials are valid by evaluating the Boolean expression , and signals the result to and by selecting when the credentials are valid ( ) and otherwise ( ). In the first case, the server communicates a to , otherwise the choreography simply terminates.

The selections from to and address knowledge of choice, as previously described.

Well-formedness. There are a number of well-formedness requirements on choreographies, which can be grouped in three categories.

  1. Intended use of choreographies. Interactions must have distinct processes: there are no self-communications, so a process cannot communicate with itself.

  2. Intended use of runtime terms. Procedure definitions may not contain runtime terms. The main choreography may include runtime terms, but their lists of processes must be nonempty and include only process names that occur in the list of processes of the corresponding procedure.

  3. Design choices in the formalisation. The list of processes of each procedure includes all processes that are used in its definition.

Well-formedness is essential in the proof of correctness of EPP (Sect. 6).

We start by formalising the different properties of choreographies separately:

  • holds if does not contain runtime terms ( );

  • holds if contains no self-communications;

  • holds if all runtime terms in have nonempty lists of process names.

These properties are defined recursively over in the natural way. Well-formedness of choreographies is defined as the conjunction of the last two properties.
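The three properties above can be sketched over a simplified syntax tree in Python. This is an illustration under assumptions, not the Coq definitions: the class and function names are hypothetical, and expressions, variables, labels and annotations are omitted, so that a single `Interaction` class stands for both value communications and label selections.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class End:
    pass


@dataclass(frozen=True)
class Interaction:
    sender: str
    receiver: str
    cont: object


@dataclass(frozen=True)
class Cond:
    proc: str
    then_branch: object
    else_branch: object


@dataclass(frozen=True)
class Call:
    name: str


@dataclass(frozen=True)
class RTCall:
    name: str
    pending: tuple  # processes that have not yet joined the procedure
    cont: object


def subterms(c):
    """Yield the choreography and all of its subterms."""
    yield c
    if isinstance(c, (Interaction, RTCall)):
        yield from subterms(c.cont)
    elif isinstance(c, Cond):
        yield from subterms(c.then_branch)
        yield from subterms(c.else_branch)


def no_runtime_terms(c):
    return not any(isinstance(t, RTCall) for t in subterms(c))


def no_self_comm(c):
    return all(t.sender != t.receiver for t in subterms(c) if isinstance(t, Interaction))


def no_empty_pending(c):
    return all(len(t.pending) > 0 for t in subterms(c) if isinstance(t, RTCall))


def well_formed(c):
    """Conjunction of the last two properties, as in the text."""
    return no_self_comm(c) and no_empty_pending(c)
```

For instance, `well_formed(Interaction("p", "p", End()))` is false because of the self-communication, while a runtime term with a nonempty list of pending processes is well-formed but contains a runtime term.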

Well-formedness of programs also takes into account the additional requirements on the lists of processes annotating runtime terms. Specifically, in a program , the choreography must be consistently annotated with respect to : in any subterm in , the list only contains processes appearing in . This property is written as , where predicate

figure du

is defined inductively in the expected way. Also, the set of procedure definitions in must be well-annotated: if , then the set of processes used in must be a nonempty subset of ,Footnote 5

figure ed

The last definition uses function , which computes the set of processes occurring in a choreography, given the set of processes used in each procedure. It generalises to , which computes the set of processes occurring in a well-annotated program.

Using these ingredients, we define well-formedness of programs as follows.

figure eg

Since choreographies do not include runtime terms, this definition also implies that all procedure definitions are well-formed.

Example 9

Let map to the pair consisting of the process list and the following choreography.

figure el

describes a file transfer protocol between a server and a client using Cyclic Redundancy Checks ( ) to detect errors from a noisy channel.

Assuming that maps all other procedure definitions to , the program satisfies .

Recall that our long-term future goal is to apply program extraction to this formalisation, and then use the result in tools. Many of the results that we show later only hold for well-formed programs, and any tool built on our theory should be able to validate that its input is well-formed. However, due to the quantification over all procedure names, well-formedness of programs is in general not decidable. In practice, though, choreographic programs only use a finite number of procedures; if these are known, well-formedness becomes decidable.

This observation motivates the definition of a recursive predicate

figure eu

such that holds iff only calls procedures in (directly). This is generalised to programs by requiring that all procedures in also satisfy the same property, and additionally that all procedures not in be defined as .

figure fb

The requirement for procedures not in is included to ensure well-formedness. From this, we can prove decidability of well-formedness.

figure fe

Applying this lemma in extracted code requires knowing a suitable set . While we cannot automatically verify that this set satisfies , it is very reasonable to trust that a correct one has been provided: typically, the relevant procedures used in a program are written down explicitly, making it straightforward to list them.

An alternative approach would be requiring the set of procedure names to be finite. This is closer in spirit to the pen-and-paper presentations of choreographic languages—even if procedure names are taken from an infinite set, only a finite number of them can be used in a concrete program [16]. We chose the present approach for simplicity, as working with finite sets in Coq is notoriously cumbersome.

3.3 Semantics

The semantics of CC is defined by means of labelled transition systems, in three layers. At the lowest layer, we define the transitions that a choreography can make, parameterised by a set of procedure definitions; then we pack these transitions into the more usual presentation, as a labelled relation on configurations (program/state pairs). Finally, we define multi-step transitions as the reflexive and transitive closure of the transition relation. This layered approach makes proofs about transitions cleaner, allowing us to separate the different levels of induction.

Transition labels. Each layer of the semantics has its type of transition labels. For the lower level, we define an inductive type whose constructors reflect the possible actions a choreography can take: value communications, label selections, reducing a conditional, or locally joining a procedure call.

The second layer uses the type of labels corresponding to the observable actions. These types are related . Labels in the third layer are simply lists of s.

figure fo

Pen-and-paper presentations only include s, which capture what can be observed in transitions without revealing syntactic information about the choreography. However, in Coq, this information is needed to obtain induction hypotheses that are strong enough for our development, which is why we have introduced s.

The transition relations are defined inductively by the rules in Figs. 5, 6, 7. For readability, we present them in a more standard rule notation – below, we exemplify how they correspond to constructors in the formalisation. We also introduce suggestive notations for all these relations: stands for (this relation is parameterised by for dealing with procedure calls); stands for , where are pairs containing a and a ; and stands for .

Fig. 5. Semantics of choreographies, lower layer ( )

Fig. 6. Semantics of choreographies, middle layer ( )

Fig. 7. Semantics of choreographies, top layer ( )

The rules defining can be divided into three groups, which we describe in the following paragraphs.

Transition rules. Rules , , and deal with execution of the first action in a choreography.

As an example, rule  corresponds to a constructor

figure gk

Including the requirement instead of simply writing in the conclusion is essential for enabling transitions between different intensional representations of the same state, which occur in practice. In particular, confluence (discussed below) does not hold without this formulation. The corresponding more compact rules are proved as lemmas, e.g.,

figure gn

These formulations can be useful in proofs that use existential tactics to infer a previously uninstantiated target of a transition.

Procedure calls. Rules , , and allow a process to enter a procedure call, with different cases according to whether other processes have already entered the procedure and/or whether there are any other processes that still have to join it.

A procedure call is expanded when the first process joins it (rule ). The remaining processes and the procedure’s definition are stored in a runtime term, from which we can observe transitions either by more processes entering the procedure (rule ) or by out-of-order execution of internal transitions of the procedure (rule , discussed below). When the last process enters the procedure, the runtime term is consumed (rule ). Rule addresses the edge case of a procedure that only uses one process.

Out-of-order execution. Rules , and deal with out-of-order execution (cf. Example 3). These rules require that the processes involved in the transition do not appear in the first term in the choreography; these conditions are specified by auxiliary predicates defined straightforwardly.
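The side condition on out-of-order execution can be illustrated by a minimal Python sketch, which is not part of the Coq development. It assumes a drastically simplified choreography, namely a list of communications encoded as (sender, receiver) pairs; the names `enabled` and `fire` are hypothetical. An action may execute if none of its processes occurs in any action in front of it.

```python
from typing import List, Tuple

Com = Tuple[str, str]  # (sender, receiver); hypothetical encoding of a communication


def enabled(chor: List[Com]) -> List[int]:
    """Indices of the communications that can execute next.

    The head action is always enabled; a later action is enabled only if its
    processes are disjoint from those of every action in front of it.
    """
    result = []
    seen = set()  # processes occurring in the prefix scanned so far
    for i, (p, q) in enumerate(chor):
        if p not in seen and q not in seen:
            result.append(i)
        seen.update((p, q))
    return result


def fire(chor: List[Com], i: int) -> List[Com]:
    """Remove the i-th communication, which must be enabled."""
    if i not in enabled(chor):
        raise ValueError("action is blocked by an earlier one")
    return chor[:i] + chor[i + 1:]


# Two independent communications followed by one that depends on both.
chor = [("o1", "s1"), ("o2", "s2"), ("s1", "o2")]
print(enabled(chor))  # [0, 1]
```

In this sketch the third communication only becomes enabled once both earlier ones have been executed, in either order.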

Example 10

Consider the program where is the choreography in Example 8 and is arbitrary (there are no recursive calls in ).

figure he

where is the evaluation of at in according to the evaluation function , and . If evaluates to at in , then execution continues as follows.

figure hp

where and .

If the check fails, the choreography instead continues as follows.

figure hs

In the compound transitions in the examples above, the actions in the label are executed in order.

Example 11

Let be as in Example 9 and be the body of . Consider the program . The processes in the procedure can join it in any order as exemplified by the transitions below.

figure hy
figure hz

The state is immaterial.

We prove a number of useful low-level properties about transitions. For example, we show that transitions are preserved by state equivalence.

figure ib

This result generalises to and . Likewise, we show that: the set of processes involved in a choreography cannot increase during execution; transitions preserve well-formedness and the set of procedure definitions; well-formed choreographies do not perform self-communications; and terminated choreographies cannot perform transitions.

3.4 Progress, Determinism, and Confluence

The challenging part of formalising CC is establishing the core properties of the language semantics, which are essential for more advanced results and not always proven in full detail in pen-and-paper publications. We discuss some of the issues encountered, as these were also the driving force behind the changes relative to [16].

The first key property of choreographies is that they are deadlock-free by design: any choreography that is not terminated can execute.

This is proved by doing case analysis on the choreography and invoking the rule consuming its first action. Since the only terminated choreography in CC is , this property also implies that any choreography either eventually reaches the terminated choreography or runs infinitely.

figure ig

The second property of our semantics is that it is deterministic, in the sense that transitions can be uniquely inferred from their label or the resulting state. These properties are essential for later results, and the need for them was the original motivation for introducing type —the first group of results does not hold if s are used in the definition of .

figure ik

The third key property is confluence, which has some relevant implications for our calculus: if a choreography has two different transition paths, then these paths either end at the same configuration, or both resulting configurations can reach the same one. This is proved by first showing the diamond property for choreography transitions (considering all possible combinations of independent transitions), then lifting it to one-step transitions, and finally applying induction to show it for multistep transitions.

figure il

As an important consequence, we get that any two executions of a choreography that end in a terminated choreography must finish in the same state.

figure im
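The consequence of confluence stated above can be checked by brute force on a toy model. The following Python sketch is an illustration under simplifying assumptions, not the formalised proof: a choreography is a list of communications that copy a variable from the sender to the receiver, and `steps` and `final_states` are hypothetical helper names. It enumerates every maximal execution, including out-of-order ones, and collects the final states.

```python
from typing import Dict, List, Set, Tuple

Action = Tuple[str, str, str, str]  # (sender, sender_var, receiver, receiver_var)
State = Dict[Tuple[str, str], int]  # maps (process, variable) to a value


def steps(chor: List[Action], state: State):
    """Yield every one-step successor, including out-of-order ones."""
    seen: Set[str] = set()  # processes of the actions scanned so far
    for i, (p, x, q, y) in enumerate(chor):
        if p not in seen and q not in seen:
            new_state = dict(state)
            new_state[(q, y)] = state[(p, x)]
            yield chor[:i] + chor[i + 1:], new_state
        seen.update((p, q))


def final_states(chor: List[Action], state: State) -> Set[Tuple]:
    """Collect the states reached by all maximal executions."""
    if not chor:
        return {tuple(sorted(state.items()))}
    results: Set[Tuple] = set()
    for next_chor, next_state in steps(chor, state):
        results |= final_states(next_chor, next_state)
    return results


chor = [("a", "x", "b", "y"), ("c", "x", "d", "y"), ("b", "y", "c", "z")]
init = {("a", "x"): 1, ("c", "x"): 2, ("b", "y"): 0, ("d", "y"): 0, ("c", "z"): 0}
print(len(final_states(chor, init)))  # 1
```

The first two communications can run in either order, yet both executions end in the same state, as the set of final states is a singleton.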

Using these results, we can establish Turing completeness of CC. The structure of the proof closely follows that of [16], and has been described in [21]. We briefly summarise it for completeness of the presentation.

First, we formalise Kleene’s partial recursive functions [34] as an inductive type in Coq. Since all functions in Coq are total, this definition only establishes syntax for them. We define an evaluation function separately that takes a partial recursive function f, an input \(\vec {n}\) and a number of steps k, and performs k steps of the computation of \(f(\vec {n})\)—where e.g. base functions evaluate to their value in one step, while unfolding a composition or performing one recursive call takes one step. This allows us to define convergence to a value (the computation finishes in a finite number of steps) and divergence (the computation does not finish in any number of steps).
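The idea of separating syntax from a step-bounded evaluator can be sketched in Python. This is not the Coq definition: the tagged-tuple syntax, the name `evaluate`, and the way fuel is consumed (one unit per nested evaluation, rather than the exact step counting of the formalisation) are assumptions made for illustration. A result of `None` means that the fuel ran out, so convergence corresponds to some fuel value yielding a number, and divergence to no fuel value doing so.

```python
from typing import Optional, Sequence, Tuple

# Hypothetical syntax of partial recursive functions as tagged tuples:
#   ("zero",), ("succ",), ("proj", i), ("comp", g, [h1, ..., hm]),
#   ("prec", base, step), ("min", g)


def evaluate(f: Tuple, args: Sequence[int], fuel: int) -> Optional[int]:
    """Evaluate f on args within the given fuel; None means 'not finished'."""
    if fuel <= 0:
        return None
    tag = f[0]
    if tag == "zero":
        return 0
    if tag == "succ":
        return args[0] + 1
    if tag == "proj":
        return args[f[1]]
    if tag == "comp":  # g(h1(args), ..., hm(args))
        inner = [evaluate(h, args, fuel - 1) for h in f[2]]
        if any(v is None for v in inner):
            return None
        return evaluate(f[1], inner, fuel - 1)
    if tag == "prec":  # f(0, xs) = base(xs); f(n+1, xs) = step(n, f(n, xs), xs)
        n, rest = args[0], list(args[1:])
        if n == 0:
            return evaluate(f[1], rest, fuel - 1)
        prev = evaluate(f, [n - 1] + rest, fuel - 1)
        if prev is None:
            return None
        return evaluate(f[2], [n - 1, prev] + rest, fuel - 1)
    if tag == "min":  # least n such that g(n, xs) = 0
        n = 0
        while True:
            v = evaluate(f[1], [n] + list(args), fuel - 1 - n)
            if v is None:
                return None  # the shrinking fuel guarantees this loop stops
            if v == 0:
                return n
            n += 1
    raise ValueError(f"unknown constructor {tag!r}")


# Addition by primitive recursion on the first argument.
ADD = ("prec", ("proj", 0), ("comp", ("succ",), [("proj", 1)]))
# Minimisation of a function that is never zero: diverges for every fuel.
LOOP = ("min", ("comp", ("succ",), [("proj", 0)]))

print(evaluate(ADD, [2, 3], 10))  # 5
print(evaluate(ADD, [2, 3], 2))   # None: not enough fuel
print(evaluate(LOOP, [], 50))     # None
```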

Next, we define a mapping from partial recursive functions to choreographies, and show that there is a correspondence between the evaluation function defined above and the execution of the choreography. In particular, given a function f and an input \(\vec {n}\), if \(f(\vec {n})\) converges to a value, then executing the choreography obtained from f from an appropriate state storing the values \(\vec {n}\) terminates in a state where a particular process stores the result; if \(f(\vec {n})\) is undefined, then execution of the choreography never terminates. The converse implications also hold. The formal proof relies essentially on the definition of the semantics of choreographies and confluence results. The interested reader is referred to the works cited above for additional details.

3.5 Discussion

Formalising the proof of confluence following [16] turned out to be a spiralling process: the pen-and-paper proof assumes some obvious properties, which were not proved; proving these required some additional lower-level lemmas; these in turn generated some even more specific lemmas; and so on. At some point, we realised that the accumulated auxiliary lemmas already accounted for 90% of the formalisation. Worse, these lemmas were extremely specific and detached from the contents of [16], even though we were far from done. This led us to rethink the design of CC, and eventually to adopt the language of [41].

In this section, we discuss the features of the original language that turned out to be problematic. These regarded the handling of procedure definitions (syntax) and the treatment of procedure calls and out-of-order execution (semantics).

Syntax. Procedures were initially defined by including a term in the grammar defining choreographies. While this removed the need for a separate notion of program, it introduced several dimensions of complexity. Even the notion of terminated choreography was nontrivial, since could occur arbitrarily deep inside some of these terms. This made it hard to ensure that the Coq definition was an adequate representation of the informal notion in [16], affecting all results regarding termination, progress, and deadlock-freedom. With the current syntax, terminated programs are exactly those for which .

Additionally, the name in acts as a binder, which added all the usual problems of working with binders—in particular, having to deal with capture-avoiding substitutions and \(\alpha \)-renaming. In the current language, procedure names are statically determined and fixed, so there is no need to rename them ever, and they can be treated as constants. This constructor also allowed for unintuitive choreographies, e.g., where the choreography itself contains additional procedure definitions.

Pairing procedure definitions with choreographies in programs yields a cleaner theory, and the overhead of an additional layer is a very small price to pay for the simplicity gained. This approach had been proposed earlier [13], and the two formulations are argued to be equally expressive in [18].

Semantics. Instead of a labelled transition system, the semantics of [16] was a reduction semantics that used a structural precongruence relation to model out-of-order execution and to unfold procedure definitions.

To understand this issue, consider again Example 3, which shows a choreography that has two possible initial transitions. In a framework with reductions and structural precongruence, the out-of-order transition is obtained by first rewriting the choreography as \(\textsf{o}_2 \mathbin {\varvec{\rightarrow }}\textsf{s}_2; \textsf{o}_1 \mathbin {\varvec{\rightarrow }}\textsf{s}_1\) and then applying rule \(\textsc {{com}}\). The set of legal rewritings is formally defined by the structural precongruence relation \(\preceq \), and there is a rule in the semantics that closes the transition relation under it.

In any proof about the semantics, an approach using structural precongruence needs to take into account all the possible ways in which choreographies may be rewritten in a reduction. Concretely, in the proof of confluence, where there are two reductions, there are four possible places where choreographies are rewritten; given the high number of rules defining structural precongruence, this led to an explosion of the number of cases. Furthermore, induction hypotheses typically were not strong enough, requiring us to resort to complicated auxiliary notions such as explicitly measuring the size of the derivation of transitions, and proving that rewritings could be normalised. This process led to a seemingly ever-growing number of auxiliary lemmas that needed to be proved, with no counterpart in the original reference [16], and after several months of work with little progress it became evident that the problem lay in the formalism.

Summary. The current proof of confluence takes about 300 lines of Coq code, including a total of 11 lemmas. This is in stark contrast with the previous attempt, which, while still unfinished, already included over 30 lemmas with extremely long proofs.

With the current definitions, the theory of CC is formalised in two files. The first file, which defines the preliminaries, contains 24 definitions, 60 lemmas and around 740 lines of code. The second file, defining the syntax and semantics of CC and proving properties about it (including all the ones described herein), contains 32 definitions, 126 lemmas, 2 theorems and around 2300 lines of code.

The formalisation of partial recursive functions contains 22 definitions and 84 lemmas, with a total of 1464 lines of code. The proof of Turing completeness consists of 28 definitions and 65 lemmas, with a total of 2371 lines of code.

4 The Process Language

The second part of our formalisation concerns the process calculus that we use for implementing CC: Stateful Processes (SP). We follow the pen-and-paper design presented in [41]. SP is used to define networks of processes running in parallel, each with its own behaviour, that can interact by direct messaging.

4.1 Syntax

The syntax of SP is structured in three layers: behaviours, which express the local actions performed by individual processes; networks, which combine processes in a system where they can interact; and programs, which pair a network with a set of procedure definitions (which all processes can call). As with CC, we assume an underlying signature.

The constructors for behaviours correspond to those for choreographies, but interactions are now split between the two different roles involved (sender and receiver). The type is defined inductively from the grammar below.

figure iv

Conditionals, procedure calls, and the terminated behaviour are standard and similar to the corresponding constructs in CC.

A term represents a send action towards , where is the expression used to compute the value to be sent and is an annotation. Dually, a term represents a receive action where a value received from is stored in the local variable ( is, again, an annotation).

A selection action is similar to a send action (label is sent to ). The dual action needs to offer a behaviour for , but may also accept other labels. In pen-and-paper presentations, these branching terms are typically defined as partial functions from labels to behaviours.

Formalising this informal description is challenging. A natural choice would be to include a constructor . However, this is problematic for defining EPP, which relies on a recursively defined function on pairs of behaviours called merging (cf. Sect. 5.1). Defining this function directly in Coq is unwieldy because of the complexity of writing the appropriate term of type given the corresponding subterms from the arguments.

Using partial functions also seems like overkill, considering that there are only two possible labels. Instead, we include a constructor

figure jk

that registers explicitly the behaviours offered for each of the two possible labels, in order. This design choice avoids the aforementioned issues, at the cost of making our development harder to generalise to larger sets of labels in the future.

Because of the option types in , the induction principles generated automatically for are not strong enough (they do not include induction hypotheses over the s appearing within branching terms). To overcome this, we define an auxiliary function measuring the depth of the AST corresponding to a , use it to prove the expected general induction principle, and define a tactic that applies it.
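The shape of this datatype and of the depth measure can be sketched in Python. This is an illustration only, not the Coq definition: conditionals, procedure calls and annotations are omitted, the class and field names are hypothetical, and the two labels are called `left` and `right` purely for the sake of the example. The point is that a branching term stores one optional behaviour per label, and that the depth function also descends into those optional subterms.

```python
from dataclasses import dataclass
from typing import Optional, Union


@dataclass(frozen=True)
class End:
    """Terminated behaviour."""


@dataclass(frozen=True)
class Send:
    target: str
    expr: str
    cont: "Behaviour"


@dataclass(frozen=True)
class Recv:
    source: str
    var: str
    cont: "Behaviour"


@dataclass(frozen=True)
class Select:
    target: str
    label: str  # "left" or "right" (hypothetical label names)
    cont: "Behaviour"


@dataclass(frozen=True)
class Branch:
    source: str
    left: Optional["Behaviour"]   # behaviour offered for the first label, if any
    right: Optional["Behaviour"]  # behaviour offered for the second label, if any


Behaviour = Union[End, Send, Recv, Select, Branch]


def depth(b: Behaviour) -> int:
    """Depth of the syntax tree, also descending into optional branches."""
    if isinstance(b, End):
        return 0
    if isinstance(b, Branch):
        subs = [depth(x) for x in (b.left, b.right) if x is not None]
        return 1 + max(subs, default=0)
    return 1 + depth(b.cont)


offer = Branch("p", Send("q", "e", End()), None)
print(depth(offer))  # 2
```

Every subterm, including those inside a `Branch`, has strictly smaller depth than the term containing it, which is the property that makes induction on this measure reach the behaviours stored in branching terms.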

Networks. Networks are simply (total) functions from processes to behaviours.

figure jr

We define extensional equality of networks in the expected way and show that it is an equivalence relation. We support the common notation for writing networks by including a function for constructing singleton networks , a parallel composition operator , and a removal operator (recall the description in Sect. 2).

For simplicity, we do not require disjoint support in parallel composition: if both networks define a nonterminated behaviour for , the result of and is different. Although this may seem odd, it has the advantage of making parallel composition total. We show that parallel composition is commutative under the assumption that the two composed networks have disjoint supports.

figure jz

Our library includes a number of results to reason about the network operations, including very specific lemmas dealing with networks that appear in the rules defining the semantics of SP, e.g., that updating the behaviours of two distinct processes yields the same result independent of the order of the updates.
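The network operations can be illustrated with a small Python sketch, in which networks are total functions represented as closures. Several details are assumptions rather than facts about the formalisation: behaviours are kept opaque as strings, `END` stands for the terminated behaviour, and parallel composition is taken to prefer its left operand whenever that operand is not terminated (the text above only states that the two orders give different results). Extensional equality is only tested on a finite list of processes here.

```python
from typing import Callable, Iterable

Behaviour = str                       # behaviours are kept opaque in this sketch
Network = Callable[[str], Behaviour]  # total function from processes to behaviours
END = "end"                           # placeholder for the terminated behaviour


def singleton(p: str, b: Behaviour) -> Network:
    """Network where p runs b and every other process is terminated."""
    return lambda r: b if r == p else END


def par(n1: Network, n2: Network) -> Network:
    """Parallel composition; assumed to prefer n1 where n1 is not terminated."""
    return lambda r: n1(r) if n1(r) != END else n2(r)


def update(n: Network, p: str, b: Behaviour) -> Network:
    """Network that agrees with n everywhere except at p."""
    return lambda r: b if r == p else n(r)


def equal_on(n1: Network, n2: Network, procs: Iterable[str]) -> bool:
    """Extensional equality, checked on a finite set of processes only."""
    return all(n1(r) == n2(r) for r in procs)


procs = ["p", "q", "r"]
a, b, c = singleton("p", "A"), singleton("q", "B"), singleton("p", "C")

print(equal_on(par(a, b), par(b, a), procs))  # True: disjoint supports
print(equal_on(par(a, c), par(c, a), procs))  # False: both define p
```

Updating two distinct processes commutes in this model as well, since each update only changes the value at its own process.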

Programs and well-formedness. As before, a program is a pair consisting of a set of procedure definitions and a network.

figure ka

Well-formedness is significantly simpler than for choreographies. If , then is well-formed, , as long as no process in attempts to communicate with itself. is well-formed, , if all processes are mapped to well-formed behaviours. This is not decidable in general, but it is under the assumption that all processes outside a given set are mapped to —an assumption that holds for all networks that can be written explicitly using parallel composition of singleton networks.

Well-formedness of programs does not make sense: well-formedness of a behaviour depends on who is executing it, but a procedure definition has no information about which processes will call it.

Example 12

Consider the network , where:

figure kk

This network implements the choreography in Example 8.

4.2 Semantics

The semantics of SP is again defined by a labelled transition system. Transitions for communications match dual actions in two processes, while conditionals and procedure calls simply run locally. There are again three layers of definitions, which are shown in Figs. 8, 9, 10, and two types of transition labels (as in CC). Transitions support suggestive notations: for , for , and for .

Fig. 8. Semantics of networks, bottom layer ( )

Fig. 9. Semantics of networks, middle layer ( )

Fig. 10. Semantics of networks, top layer ( )

These definitions warrant similar observations as those for the semantics of CC. Transitions include premises on network equality and state equality, rather than requiring specific values. We include some lemmas stating the more restricted rules, both as a sanity check and because they can be useful to instantiate variables created by the use of existential tactics in proofs.

figure ku

There are two rules for reducing selections, one for each label. This is a deviation from standard practice (where there is a single rule and a premise matching the label in both behaviours) stemming from our design choice of avoiding functions in branching terms. Having an extra rule generates additional cases in induction proofs, but this formulation effectively simplifies the formalisation by eliminating one layer of inversion.
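How dual actions are matched can be sketched in Python under simplifying assumptions that do not come from the formalisation: a network is a finite dictionary from process names to behaviours encoded as tagged tuples, the sent expression is just a variable of the sender, and the labels are named `left` and `right`. The function names `step_com` and `step_sel` are hypothetical. The selection step has one case per label, mirroring the two rules discussed above.

```python
from typing import Dict, Tuple

END = ("end",)
# Hypothetical encodings of behaviours:
#   ("send", q, var, cont), ("recv", p, var, cont),
#   ("sel", q, label, cont), ("branch", p, left_or_None, right_or_None)


def step_com(net: Dict[str, Tuple], state: Dict[str, Dict[str, int]], p: str, q: str):
    """Fire a communication from p to q if both heads match, else return None."""
    bp, bq = net.get(p, END), net.get(q, END)
    if bp[0] == "send" and bp[1] == q and bq[0] == "recv" and bq[1] == p:
        value = state[p][bp[2]]  # the sent expression is a variable of p here
        new_state = {**state, q: {**state.get(q, {}), bq[2]: value}}
        return {**net, p: bp[3], q: bq[3]}, new_state
    return None


def step_sel(net: Dict[str, Tuple], p: str, q: str):
    """Fire a selection from p to q; one case per label, None if impossible."""
    bp, bq = net.get(p, END), net.get(q, END)
    if bp[0] != "sel" or bp[1] != q or bq[0] != "branch" or bq[1] != p:
        return None
    if bp[2] == "left" and bq[2] is not None:   # case for the first label
        return {**net, p: bp[3], q: bq[2]}
    if bp[2] == "right" and bq[3] is not None:  # case for the second label
        return {**net, p: bp[3], q: bq[3]}
    return None  # the selected label is not offered


net = {
    "p": ("sel", "q", "left", ("send", "q", "x", END)),
    "q": ("branch", "p", ("recv", "p", "y", END), None),
}
net1 = step_sel(net, "p", "q")
net2, state2 = step_com(net1, {"p": {"x": 7}}, "p", "q")
print(state2["q"]["y"])  # 7
```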

Example 13

We illustrate the possible transitions of the network from Example 12. We abbreviate the behaviours of processes that do not change in a reduction to to make it clearer what parts of the network are changed. Furthermore, we omit trailing s in s.

The network starts by performing the transition

figure ky

where and are as in Example 10.

If , execution continues as

figure lc

where and are again as in Example 10. Otherwise, it continues as follows.

figure lf

The labels in these reductions are exactly as in Example 10.

4.3 Determinism and Confluence

As for CC, we prove a number of useful results about the semantics of SP. These can be roughly divided into two groups: results showing that reductions are stable under the extensional equalities on the different types involved, and properties of the actual transitions. While the results in the first category are not surprising, they are useful and show that the definitions make sense.

figure lg

While determinism and confluence are similar to the corresponding results for CC, they are not as interesting: for networks generated by EPP (which are the ones we are interested in), these results would follow from the same properties for choreographies.

The formalisation of SP consists of 25 definitions, 81 lemmas, 11 simple tactics, and approximately 1960 lines of Coq code.

5 Endpoint Projection

As with the simple language from Sect. 2, the intuition for generating process implementations is that each choreographic action should be projected to the corresponding process action. The prototypical example is the value communication , which should be projected to a send action for , to a receive action for , and skipped for any other processes.
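For choreographies consisting only of value communications, this intuition fits in a few lines of Python. The sketch below is an illustration, not the formalised projection: the tuple encodings and the name `project` are assumptions, and self-communications are assumed not to occur.

```python
from typing import List, Tuple

Com = Tuple[str, str, str, str]  # (sender, expression, receiver, variable)


def project(chor: List[Com], r: str) -> List[Tuple]:
    """Behaviour of process r for a sequence of value communications."""
    behaviour = []
    for p, e, q, x in chor:
        if r == p:
            behaviour.append(("send", q, e))   # r is the sender
        elif r == q:
            behaviour.append(("recv", p, x))   # r is the receiver
        # any other process skips the action
    return behaviour


chor = [("p", "e", "q", "x"), ("q", "x", "r", "y")]
print(project(chor, "q"))  # [('recv', 'p', 'x'), ('send', 'r', 'x')]
print(project(chor, "s"))  # []
```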

In the presence of conditionals, this intuition is not enough. Projecting a conditional for any process other than , say , is nontrivial, because has no way of knowing which branch should be executed. Therefore ’s behaviour must combine the projections obtained for and .

This problem is commonly known as knowledge of choice, and one of the solutions relies on the usage of label selections [8, 9]. If ’s behaviour should depend on the result of ’s local evaluation, then the result of this evaluation should be communicated to by means of a label selection. The two possible behaviours can then be combined in a branching term offering two different options.

5.1 Merge

A standard way of combining behaviours to solve the problem above is the merge operator [8]: a partial binary operator that returns a behaviour combining all possible executions of its arguments (if possible). In SP, two behaviours can be merged only if they are built from the same constructor with matching parameters. So if can be merged with to yield , we can also merge with to obtain , but can never be merged with for (different arguments) or with (different constructor).

The only exception is branching terms, where merge can combine offers on different labels. For example, merging with yields . In this way, the prototypical choreographic conditional can be projected for as .
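The behaviour of merge on these cases can be sketched as a Python function that returns `None` when the merge is undefined. This is an illustration under assumptions, not the relation defined in the formalisation: behaviours are tagged tuples, conditionals and procedure calls are left out, and all names are hypothetical.

```python
from typing import Optional, Tuple

END = ("end",)
# Hypothetical encodings: ("send", q, expr, cont), ("recv", p, var, cont),
# ("sel", q, label, cont), ("branch", p, left_or_None, right_or_None)


def merge(a: Tuple, b: Tuple) -> Optional[Tuple]:
    """Merge two behaviours; None signals that the merge is undefined."""
    if a[0] != b[0]:
        return None                      # different constructors never merge
    if a[0] == "end":
        return END
    if a[0] in ("send", "recv", "sel"):
        if a[1:3] != b[1:3]:
            return None                  # parameters must match exactly
        cont = merge(a[3], b[3])
        return None if cont is None else (a[0], a[1], a[2], cont)
    if a[0] == "branch":
        if a[1] != b[1]:
            return None
        merged = []
        for x, y in ((a[2], b[2]), (a[3], b[3])):
            if x is None or y is None:
                merged.append(x if y is None else y)  # offer on one side at most
            else:
                m = merge(x, y)          # both sides offer this label
                if m is None:
                    return None
                merged.append(m)
        return ("branch", a[1], merged[0], merged[1])
    return None


A = ("send", "q", "e", END)
B = ("recv", "q", "x", END)
print(merge(("branch", "p", A, None), ("branch", "p", None, B)))
print(merge(A, ("send", "q", "e2", END)))  # None: different arguments
print(merge(A, B))                         # None: different constructors
```

The first call combines offers on different labels into a single branching term offering both.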

The partiality of merge again poses a formalisation problem. Our original approach [20] defined an auxiliary type that extends the syntax of behaviours with a constructor . In this work, instead, we define a ternary relation (Footnote 6). While this design requires two additional lemmas stating that this relation is functional and computable, it significantly simplified this part of the formalisation (both in size and complexity of the proofs). As an example, [20] reported a number of inversion results, e.g., if merging two behaviours yields a behaviour starting with a send action, then both arguments start with that same action. All these results can now be obtained directly by applying inversion on the relevant hypotheses.

The full definition of includes 22 clauses. Figure 11 lists all representative cases; the missing clauses deal with the remaining combinations of subterms in branching terms (see Sect. 5.5 for a discussion on the exponential dependency of the number of clauses on the number of labels, and on the problems with formalising the more general definition from the literature). We also define the suggestive notation for , which reminds us that is a partial function.

Fig. 11. Definition of the merge relation

We show that merge is functional, decidable, and preserves well-formedness.

All proofs are simple using induction on behaviours and inversion on the hypotheses on .

figure mw

Decidability is formulated using the stronger existential quantifier so that we can also obtain the existential witness to use in further definitions.

5.2 Branching Order

In the literature, the arguments of merge and its result, when defined, are in a relation known as the branching order [8, 41]. This is formalised as yet another inductive type, defined by the rules in Fig. 12 (Footnote 7). We call the relation , for which we define the infix notation .

Fig. 12. Definition of the branching order

The branching order is reflexive, transitive and antisymmetric. It is pointwise extended to networks by defining where is infix notation for . This relation is again reflexive, transitive and antisymmetric (with respect to extensional equality).
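A possible reading of the branching order can be sketched in Python. Since the rules of Fig. 12 are not reproduced here, the sketch rests on an assumption: one behaviour is below another if the two agree everywhere except that branching terms in the larger one may offer additional labels. Behaviours are encoded as tagged tuples, conditionals and procedure calls are omitted, and the name `below` is hypothetical.

```python
from typing import Tuple

END = ("end",)
# Hypothetical encodings: ("send", q, expr, cont), ("recv", p, var, cont),
# ("sel", q, label, cont), ("branch", p, left_or_None, right_or_None)


def below(a: Tuple, b: Tuple) -> bool:
    """True if b offers everything a offers, possibly with extra branches."""
    if a[0] != b[0]:
        return False
    if a[0] == "end":
        return True
    if a[0] in ("send", "recv", "sel"):
        return a[1:3] == b[1:3] and below(a[3], b[3])
    if a[0] == "branch":
        if a[1] != b[1]:
            return False
        for x, y in ((a[2], b[2]), (a[3], b[3])):
            if x is None:
                continue  # a offers nothing on this label, nothing to check
            if y is None or not below(x, y):
                return False
        return True
    return False


small = ("branch", "p", END, None)
big = ("branch", "p", END, ("send", "q", "e", END))
print(below(small, big))  # True: big only adds a branch
print(below(big, small))  # False
```

On this toy encoding, `below` holds in both directions only for syntactically equal behaviours, which matches the antisymmetry stated above.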

More interestingly, adding branches to some behaviours in a network does not eliminate any transitions that the network can do.

figure nd

(The quantification on makes this lemma easier to apply.)

We can now justify the notation for : it is the partial join for the branching order, in the sense that if two behaviours have an upper bound, then they are mergeable and their merging is their least upper bound.

figure ng

Another key result is that the branching order is stable under merging:

figure nh

As we will see, this result is essential for the cases of the EPP theorem dealing with conditionals.

Lastly, we prove the algebraic properties of (idempotency, commutativity, and associativity) by exploiting the relationship between and the branching order.

All proofs in this section are again simple induction arguments using inversion on the hypotheses on and .

5.3 Projection

We can now define the projection of a choreography for an individual process. Since this definition relies on for the case of the conditionals, it is also a partial function. We define it inductively as a relation

figure nn

and abbreviate to .

The type of also reveals a new feature of projection when compared to the simple language from Sect. 2: the signature for the target instance of SP is different from that of the source instance of CC. The reason for this lies in the presence of procedure definitions: each procedure yields several projected procedures, one for each process in the choreography (Footnote 8). The type of procedure names in the target of projection is thus ; this can be seen in the rule for projecting procedure calls, which is included with the remaining rules in Fig. 13.

Fig. 13. Rules for projecting a choreography for a given target process. The notations are the ones printed by Coq, but they are not parsable due to the different signatures

We show that is functional and decidable, and that it returns well-formed behaviours for choreographies without self-communications.

From we obtain several notions of projectability: relative to a process or a set of processes, and projectability of —which requires each procedure to be projectable relative to its set of used processes.

These notions complement well-formedness, as being well-formed is not enough to be projectable—well-formedness does not ensure knowledge of choice.

figure nw

A program is projectable if the main choreography is projectable for all its processes and the set of procedure definitions is projectable.

figure nx

Finally, we want to compute projections, which are again partial functions. Since our ultimate goal is to extract a correct implementation of EPP, we need to take a different approach to partiality and define

figure ny

taking a proof term as additional argument (for which we prove proof irrelevance). These definitions are interactive, so we also state and prove lemmas showing that they yield the expected results as in pen-and-paper presentations [16, 41].

figure nz

Paving the way for the EPP theorem, we prove a number of inversion lemmas for EPP, which cannot be trivially obtained by applying inversion to a hypothesis.

The proofs follow the structure of the interactive definition, possibly combined with induction on the choreography.

figure oa

5.4 Strong Projectability

The operational correspondence between choreographies and their projections, which is the topic of Sect. 6, states that a projectable choreography can make a transition iff its projection can make a corresponding transition. Generalising this result to multi-step transitions requires chaining applications of this correspondence. However, projectability is not preserved by transitions, due to how runtime terms are projected: is projected as if is in , and as the projection of otherwise. Our definition of projectability allows to be unprojectable for any process in , which would make the result of the latter transition unprojectable.

This situation can never arise if one respects the intended usage of runtime terms: initially is the body of a procedure, and is the set of processes used in it. Afterwards only shrinks, while may change due to execution of actions that involve processes not in (which keeps projectable). This assumption is implicit in pen-and-paper presentations. We formalise it in the following definition of strong projectability.

figure oo

The last conjunct in the case of conditionals is needed to guarantee that strong projectability implies projectability. The last conjunct in the case of runtime terms captures the notion that may differ from the original definition of procedure , but the transitions in the reduction path did not involve processes that still have to execute the procedure call.

Projectability and strong projectability coincide for initial choreographies. Furthermore, we state and prove lemmas that show that implies both and for any choreography that can transition to. (This is the reason for including the last conjunct in the clause defining strong projectability of conditionals: without it, we still would not be able to prove that .)

Strong projectability for programs requires as expected that all choreographies in the program be strongly projectable. Furthermore, we also require the program to be well-formed. This assumption makes the definition simpler and more manageable, as all procedures will be initial and annotated with the right sets of processes.

figure ox

Using these results, we can start relating the semantics of choreographies with the definition of EPP. For example, if can execute a communication from to , then the behaviour of its projection for starts by sending the corresponding expression to , while ’s behaviour starts by receiving a value from .

figure pf

An interesting corner case is what happens for processes not involved in the transition: they may lose some subbehaviours in branching terms due to some branches of conditionals disappearing from the choreography.

figure pg

The first hypothesis states that the procedures in are well-annotated.

All these proofs use induction on the choreography. As a consequence of these lemmas, we get that strong projectability is preserved by transitions.

figure pi

The hypotheses of the first lemma all hold if is a strongly projectable program.

5.5 Discussion

Modelling of partial functions. The definition of very explicitly considers the \(2^4=16\) possible combinations of behaviours that can be offered when both arguments are branching terms. This clearly does not scale if the set of labels is larger, and it is the place where our design choice of fixing it to a two-element set is most critical. The same issue, but with a smaller impact, arises in the definition of , which includes \(2^2=4\) clauses related to branching terms, and in the definition of , which includes one clause for each branching term.
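The growth in the number of cases can be made concrete with a short Python sketch. It only counts combinations and makes no claim about how clauses are organised in the formalisation; the name `branching_clauses` is hypothetical. When both arguments are branching terms, each of them either offers or omits a behaviour for every label, so with \(n\) labels there are \(2^{2n}\) combinations.

```python
from itertools import product


def branching_clauses(num_labels: int) -> int:
    """Count the cases that arise when both arguments are branching terms."""
    # Each of the two arguments either offers or omits a behaviour per label.
    slots = 2 * num_labels
    return sum(1 for _ in product((False, True), repeat=slots))


print(branching_clauses(2))  # 16
print(branching_clauses(3))  # 64
```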

We do not think that this issue can be circumvented. Our original approach considered an unspecified type , and branching terms had type

figure po

This definition quickly proved unusable in practice: the induction principles generated by Coq were too weak, and most datatypes related to the process calculus had an undecidable equality. Furthermore, we ran into problems with definitions that required inspecting the behaviours associated to the labels because of the size restrictions in elimination combinators.

The first time we managed to have a working definition of was after fixing the set of labels to contain two elements. This was the approach presented in [20], where is formalised by first defining a total function

figure pr

(where is a type including subterms with the obvious intended meaning), and then defining as , where is the trivial injection from to .

Apart from the added complexity of having duplications of types and definitions throughout the formalisation, working with these functions is very cumbersome. The definition of relies heavily on deciding equalities, so proofs of results about necessarily had to perform the same eliminations. At the end of the day, the number of cases in proofs was in the same order of magnitude as in the current version—but they were generated in several verbose elimination steps, rather than directly from performing induction/inversion on a hypothesis. Furthermore, the old definition required us to consider a significant number of absurd cases (in some lemmas, around \(90\%\) of the total), whereas with the current definition these cases are simply not generated. The only added complexity we noticed while adapting the formalisation was that we occasionally needed to apply lemma to infer that two behaviours are identical—but the size of this part of the formalisation was reduced by about \(80\%\) (from around 3150 lines down to 700 lines).

Taking all these aspects into account, we believe that the current design choices are the best possible compromise at this stage between the full generality given by including an unrestricted set of labels and the benefits of having a fully formalised theory.

Projectability. The lemmas relating projectability to the low-level semantics of choreographies typically include several hypotheses, cf. lemma . For programs, we packaged these properties in a single definition ( ). For the lower-level lemmas, we decided against this because not all these properties are needed in all lemmas—some are only required in results involving procedure calls, others are important for conditionals, and communications require far fewer. By including only the necessary assumptions in each lemma, we obtain more robust results.

Strong projectability. The need for strong projectability was independently identified in the pen-and-paper presentation in [41]. There, the projectability requirement on runtime terms was included in the notion of well-formedness for choreographies. While this option matches the intuition of “intended usage of runtime terms”, it requires having defined projection. In our formalisation, we strive for modularity, and we opted for a design where the choreographic calculus is fully decoupled from the target language and the definition of projection. In this way, we allow for future extensions of our development with alternative definitions of EPP.

In the future, it would be interesting to investigate whether there is a syntactic characterisation of “intended usage of runtime terms” that is completely at the level of choreographies. Such a characterisation would yield the benefits of both approaches described above: it would give us a notion of well-formedness closer to intuition, while keeping it decoupled from EPP.

Summary. The definitions of branching order, merge and Endpoint Projection, together with the accompanying lemmas, are divided into three files totalling 14 definitions, 126 lemmas and 15 tactics that automate recurring types of goals. The formalisation of EPP is by far the largest part, at over 2200 lines of Coq code (with approximately 100 lemmas), while the branching order and merge require 260 and 440 lines of Coq code, respectively (with a total of only 3 results that require longer proofs).

6 The EPP Theorem

The operational correspondence between choreographies and their projections, in languages that include both conditionals and out-of-order execution, is not as straightforward as for the simple language in Sect. 2. In particular, branching terms in networks may linger for a bit longer compared to the choreographies that generated them. This requires referring to the branching order in the EPP theorem:

figure qe

(Recall that takes a proof of projectability as its last argument.)
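For orientation, statements of this kind usually have the following shape in the pen-and-paper literature. This is a hedged sketch and not a transcription of the Coq statement: \(\llbracket C \rrbracket\) stands for the projection of \(C\), \(\sigma\) for a memory state, \(\mu\) for a transition label, and \(N \sqsupseteq N'\) is notation assumed here for "each process in \(N\) offers at least the branches it offers in \(N'\)". Procedure definitions and the projectability hypotheses are omitted, and the labels on the two sides are identified for simplicity.

\[
\begin{array}{ll}
\text{Completeness:} &
\langle C, \sigma \rangle \xrightarrow{\mu} \langle C', \sigma' \rangle
\implies
\exists N'.\ \langle \llbracket C \rrbracket, \sigma \rangle \xrightarrow{\mu} \langle N', \sigma' \rangle
\ \wedge\ N' \sqsupseteq \llbracket C' \rrbracket \\[4pt]
\text{Soundness:} &
\langle \llbracket C \rrbracket, \sigma \rangle \xrightarrow{\mu} \langle N', \sigma' \rangle
\implies
\exists C'.\ \langle C, \sigma \rangle \xrightarrow{\mu} \langle C', \sigma' \rangle
\ \wedge\ N' \sqsupseteq \llbracket C' \rrbracket
\end{array}
\]

The second conjunct captures the lingering branches: after a step, the network need not be exactly the projection of the resulting choreography, only something that offers at least the same branches.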

Completeness is not too hard to prove. As in [16, 41], the result is proven by considering the possible transitions that can make; there are four cases, and the results proved earlier about the shape of the projection of suffice to establish the thesis without too much work. The whole proof is 250 lines long, and the generalisation to multi-step transitions requires an additional 40 lines.

The proof of soundness is known to be harder [8, 16, 40, 41]. A common strategy is to proceed by induction on the choreography, and then do case analysis on the possible network transitions. Either the transition corresponds to the first term in the choreography, in which case we can apply the matching choreography rule; or it does not, in which case we can apply a delay rule and invoke the induction hypothesis.

Each of these cases is challenging in itself, and they are therefore stated as separate lemmas on transitions. As an example, the transition lemma for communications reads

figure qi

and the corresponding proof script is around 320 lines long. There are five of these lemmas in total, of a similar level of complexity.

Soundness also requires an additional lemma on procedure calls:

figure qj

which is needed to apply the corresponding transition lemma.

Chaining applications of also requires that extending the projection of a choreography with extra branches does not add transitions.

figure ql

This result is lifted to and . The latter generalisation requires applying . It is then itself used to prove soundness of EPP for multi-step transitions.

The proof of the EPP theorem consists of an additional 2650 lines of Coq code, for only 14 lemmas.

7 Related Work

The need for formalising concurrency theory is identified in [39], where the authors formalised a published article on a process calculus in Coq and discovered several major flaws in the proofs. The authors

[...] feel that it is [the errors’] very presence in a peer-reviewed, state-of-the-art paper that strongly underlines the need for a more precise formal treatment of proofs in this domain. [39, Sect. 6]

Since then, there have been a number of formalisation efforts in this area. We discuss the ones closest to our work.

To the best of our knowledge, our original presentations [20, 21] were the first formalisations of a choreographic language featuring the expected programming constructs that allow for infinite and branching concurrent behaviour. As we discussed, this article presents a substantial improvement of the original development.

More recently, there have been two additions to the family of fully-formalised choreographic programming languages.

Kalas is a certified compiler written in HOL from a choreographic language similar to ours to CakeML [44].

As in our development, the set of selection labels in Kalas is restricted to two. Kalas includes an asynchronous semantics, while ours is synchronous, but the notion of EPP is more restrictive than ours: it is an ad-hoc definition that bypasses the need for the merge operator, but does not provide its full flexibility. In particular, processes evaluating conditionals must immediately send selections to the processes that need them, while CC is more faithful to the pen-and-paper literature on choreographies [7, 8, 30].

Example 14

To illustrate the flexibility of our projection, consider the following enhanced version of our distributed authentication choreography from Example 8, where now immediately communicates whether the authentication attempt was successful to a .

figure qr

This choreography is projectable in our framework but not in Kalas, because performs does not immediately engage in selections in the branches of the conditional. However, doing so would require postponing the logging action, which might be important to do right away because of non-functional requirements.

Pirouette is a functional choreographic programming language formalised in Coq [29]. It supports asynchronous communication and higher-order functions, but at the cost of introducing hidden global synchronisations for all processes whenever a function is called. The semantics of CC is, instead, decentralised and all synchronisations are syntactically explicit—but again it only has synchronous semantics and no higher-order features.

Extending CC with asynchronous communication has also been studied [12], but since it was not part of the reference pen-and-paper work that we followed, we postponed its formalisation to future work.

Another line of research connected to choreographic programming is that of multiparty session types [30]. These types are essentially choreographies without computation (e.g., communications only specify sender, receiver, and message type, but not how the message is computed or where it is stored), and are therefore simpler than CC. There are two available formalisations of multiparty session types [10, 33]. Both formalisations include a counterpart to the EPP theorem, but they are even more restrictive than Kalas in how they handle the projection of conditionals.

8 Conclusion

We presented a formalisation of a state-of-the-art article on the theory of choreographic programming. The formalisation process unveiled subtle problems in definitions, making a case for a more systematic use of theorem provers to validate results in the field. Moreover, it positively impacted the theory itself, showing that formalisation can also be a valuable tool in the design phase of the research process.

Our formalisation was done in parallel with the pen-and-paper revision of CC carried out in [41]. There are two interesting observations to make about this parallel development. First, many of the technical aspects that we discuss in this article were also independently discovered during the writing of [41]. Second, the seemingly disparate goals of making the theory more intuitive to students and amenable to formalisation actually converged on the same solution, and sometimes resulted in useful exchanges of feedback. Taken together, these two observations strongly suggest that the current formulation of CC is the “right” one, and offers a suitable basis for future developments.

Our work also provides some valuable lessons about formalising semantics of concurrent systems. While choosing between a reduction semantics with a structural precongruence for dealing with out-of-order execution or a transition semantics based on a labelled transition system was mostly a matter of taste in pen-and-paper presentations, the latter approach is clearly preferable from a formalisation point of view. Since it does not require syntactic manipulation of choreographies for modelling transitions, the derivations corresponding to execution steps are shorter and do not include potential redundancy, which makes it easier to reason about them and to find appropriate induction hypotheses.
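The labelled-transition style can be sketched on a toy choreography language with communications only. Everything below is hypothetical and heavily simplified (no values, conditionals or procedures; the names `Chor`, `step` and `disjoint` are ours); it only illustrates how a delay rule models out-of-order execution without rewriting the choreography.

```coq
Definition Pid := nat.

(* Toy choreographies: a sequence of communications from p to q. *)
Inductive Chor : Type :=
| Stop : Chor
| Com : Pid -> Pid -> Chor -> Chor.

(* Transition labels record which communication was executed. *)
Inductive Label : Type :=
| LCom : Pid -> Pid -> Label.

(* The processes p, q are all different from r, s. *)
Definition disjoint (p q r s : Pid) : Prop :=
  p <> r /\ p <> s /\ q <> r /\ q <> s.

Inductive step : Chor -> Label -> Chor -> Prop :=
| step_com : forall p q C,
    step (Com p q C) (LCom p q) C
| step_delay : forall p q r s C C',
    (* a later action may run first if it involves other processes *)
    disjoint p q r s ->
    step C (LCom r s) C' ->
    step (Com p q C) (LCom r s) (Com p q C').

(* The second communication executes before the first one. *)
Example out_of_order :
  step (Com 1 2 (Com 3 4 Stop)) (LCom 3 4) (Com 1 2 Stop).
Proof.
  apply step_delay.
  - unfold disjoint; repeat split; discriminate.
  - apply step_com.
Qed.
```

A derivation in this style follows the syntax of the choreography, with one use of `step_delay` per skipped action, so induction on the derivation yields a usable induction hypothesis. With a structural precongruence, the same step would first require rearranging the term, and the rearrangement itself would have to be reasoned about.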

Our formalisation further benefits from the design choice of defining all procedures at the top level, which allows us to bypass all the complexity of having to work explicitly with binders and substitution.

We have already started exploring extensions and applications of our formalisation. These include amendment (a procedure that injects appropriate selections to make a choreography projectable), a proof of starvation-freedom, alternative definitions of EPP, and applying program extraction to develop a certified toolchain from choreographies to executable code.

An important tool for future extensions is stronger automation for proofs about choreographies. Our development already includes a few simple tactics that deal with commonly recurring goals, but it would be worthwhile to extend this library with more powerful tactics, e.g., to reason about multi-step transitions. Furthermore, many proofs by structural induction have strong similarities among their different cases, and it would be interesting to try to automate proof strategies that can capitalise on this.

The appeal of choreographic programming largely depends on its promise of delivering correct implementations, by removing the possibility of human error through EPP. This promise has motivated a proliferation of choreographic programming languages, including features of practical value such as asynchronous communication, nondeterminism, broadcast, dynamic network topologies, and more [2, 26, 31, 41]. The theories of these languages are becoming more and more complex, thus increasing the likelihood of critical mistakes and making the case for more trustworthy developments. We hope that our work can contribute a solid foundation for the development of these features.