Service Oriented Computing and Applications, Volume 9, Issue 3–4, pp 269–284

Pabble: parameterised Scribble

Open Access
Special Issue Paper

Abstract

Many parallel and distributed message-passing programs are written in a parametric way over available resources, in particular the number of nodes and their topologies, so that a single parallel program can scale over different environments. This article presents a parameterised protocol description language, Pabble, which can guarantee safety and progress in a large class of practical, complex parameterised message-passing programs through static checking. Pabble can describe an overall interaction topology, using a concise and expressive notation, designed for a variable number of participants arranged in multiple dimensions. Local protocols are then generated automatically from these parameterised protocols and are used to type-check parameterised MPI programs for communication safety and deadlock freedom. Despite the undecidability of endpoint projection and type checking in the underlying parameterised session type theory, our method guarantees the termination of endpoint projection and type checking.

Keywords

Multiparty session types · Communication safety · Deadlock freedom · Pabble · Protocol description language · Dependent types

1 Introduction

Message-passing is becoming a dominant programming model, as witnessed in application programs from high-performance computing scaling over thousands of cores or cloud-based scalable backends of popular web services. These are environments where services are dynamically provided, through choreography of interactions among numerous distributed components. Assuring safety of concurrent software in these environments is a vital concern: Many message-passing libraries, programs and systems are shared and long-lived, and some process sensitive data, so that safety violations such as deadlocks and incompatible messaging patterns or data payloads between senders and receivers can have catastrophic and unexpected consequences [10].

Our proposal for safety assurance for message-passing programs is based on multiparty session types [13]. The methodology considers the specification of a global interaction protocol among multiple participants, from which we can derive a local protocol for an individual participant. Once each program is type-checked against its local protocol, a set of typed programs is guaranteed to run without deadlock or communication mismatches. We based our work on [20], where the authors proposed a programming framework for message-passing parallel algorithms, centering on explicit, formal description of global protocols, and examined its effectiveness through an implementation of a toolchain for the C language. The toolchain uses a language Scribble  [12, 24] for describing the multiparty session types in a Java-like syntax. A simple example of a protocol in Scribble which represents a ring topology between four workers is given below:

A Scribble protocol starts with the keywords global protocol, followed by the protocol name, Ring. The role declarations are then passed as parameters of the protocol; here they are the four worker roles. The Ring protocol describes a series of communications in which the first worker passes a message to the last worker by forwarding it through the second and third workers, in that order, and then receives a message from the last worker. Explicitly describing all interactions among distinct roles in this way is verbose and inflexible: for example, when extending the protocol with an additional fifth worker role, we must rewrite the whole protocol. On the other hand, we observe that these worker roles have identical communication patterns which can be logically grouped together: worker i+1 receives a message from worker i, and the last worker sends a message to the first worker. In order to capture these replicable patterns, we introduce an extension of Scribble with dependent types called Parameterised Scribble (Pabble). In Pabble, multiple participants can be grouped in the same role and indexed. This greatly enhances the expressive power and modularity of the protocols. Here ‘parameterised’ means that the number of participants in a role can be changed by parameters.

The following shows our ring example in the syntax of Pabble.
The role declaration declares workers indexed from 1 to an arbitrary integer N. The worker roles can be identified individually by their indices: index 1 refers to the first worker and index N refers to the last. In the body of the protocol, the sender role declares multiple workers, bound by the bound variable i, which iterates from 1 to N−1. The receivers, at index i+1, are calculated from their indices for each instance of the bound variable i. The second line is a message sent back from the last worker (index N) to the first worker (index 1).
The local protocol of the Ring protocol is obtained by projection with respect to the parameterised worker role. The projection for a parameterised role gives a parameterised local protocol, which represents multiple end points in the same logical grouping.
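As an illustration of what the parameterised ring describes, the following minimal Python sketch expands the ring into concrete sender/receiver pairs for a fixed number of workers. The function name ring_interactions and the tuple representation are assumptions made for this sketch; they are not part of Pabble or its toolchain.

```python
def ring_interactions(n):
    """Expand a ring over workers 1..n into concrete (sender, receiver) pairs.

    Hypothetical helper for illustration only; it is not part of Pabble.
    """
    if n < 2:
        raise ValueError("a ring needs at least two workers")
    # Forwarding phase: worker i sends to worker i+1, for i in 1..n-1.
    forward = [(i, i + 1) for i in range(1, n)]
    # Closing message: the last worker sends back to the first worker.
    return forward + [(n, 1)]


if __name__ == "__main__":
    for sender, receiver in ring_interactions(4):
        print(f"worker {sender} -> worker {receiver}")
```

Changing the argument is the only change needed to describe a larger ring, which is the modularity that the parameterised notation aims to provide.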

Challenges The main technical challenge for the design and implementation of parameterised session types is to develop a method to automatically project a parameterised global protocol to a parameterised local protocol ensuring termination and correctness of the algorithm.

Unfortunately, as in the indexed dependent type theory in the \(\lambda \)-calculus [2, 33], the underlying parameterised session type theory [9] has shown that the projection and type checking with general indices are undecidable. Hence, there is a tension between termination and expressiveness to enable concise specifications for complex parameterised protocols.

Our main approach to overcome these challenges is to make the theory more practical by extending Scribble with index notation originating from a widely used textbook for modeling concurrent Java [16]. For example, the binding range notation used for the senders and the relative index notation used for the receivers in the Ring protocol are from [16]. Interestingly, this compact notation is not only expressive enough to describe representative topologies ranging from parallel algorithms to distributed web services, but also offers a solution to cope with the undecidability of parameterised multiparty session types.

1.1 Overview

Figure 1 shows the relationships between the three layers: global protocols, local protocols and implementations. (1) A programmer first designs a global protocol using Pabble. (2) Then, our Pabble tool automatically projects the global protocol into its local protocols. (3) The programmer then either implements the parallel application using the local protocol as specification, or type-checks existing parallel applications against the local protocol. If the communication interaction patterns in the implementations follow the local protocols generated from the global protocol, this method automatically ensures deadlock-free and type-safe communication in the implementation. In this work, we focus on the design and implementation of the language for describing parallel message-passing-based interaction as global and local protocols in (1) and (2) and outline how a Pabble local type checker for MPI (3) can be implemented.
Fig. 1 Pabble programming workflow

This article presents a full version of the work published in [19] which had a particular focus on modeling and expressing communication topologies in parallel applications. Apart from including the detailed proofs for the well-formedness conditions and a number of additional examples, we include use cases from web services and large-scale distributed cyber-infrastructures to show the flexibility of the Pabble language for compact parametric protocols outside the field of high-performance parallel applications. We also expand the related work for a more thorough survey and discussion on formal verification with MPI-based parallel applications.

The contributions of this article are:
  • The first design and implementation of parameterised session types in a global protocol language (Pabble) (Sect. 2.2). The protocols can represent complex topologies with an arbitrary number of participants, enhancing expressiveness and modularity for practical message-passing parallel programs.

  • The projection algorithm for Pabble, which checks the well-formedness of parameterised global protocols (Sects. 2.3 and 2.4) and generates parameterised local protocols from well-formed parameterised global protocols (Sect. 2.5). A correctness and termination proof of the projection algorithm is also presented (Sect. 2.7).

  • A number of Pabble use cases in parallel programming and web services in Sect. 3.

Additional use cases of Pabble such as common interaction patterns for high-performance computing described in Dwarfs [1] can be found on the project web page [21]. We also outline a methodology for type checking source code written with MPI against Pabble protocols in Sect. 4.

2 Pabble: parameterised Scribble

Scribble [24] is a developer-friendly notation for specifying application-level protocols based on the theory of multiparty session types [3, 13]. This section introduces an evolution of Scribble with parameterised multiparty session types (Pabble), defines its endpoint projection and proves its correctness.

2.1 The Pabble protocol language

The core elements of a Pabble protocol are interaction statements, choices and iterations. These are features common also to the Scribble language, which Pabble is extended from. Hence, Scribble protocols are compatible with Pabble, but the most expressive features such as role parameterisation can only be found in Pabble.

Interaction statements describe the messages passed between distributed participants of the protocol. For example, an interaction statement in the Ring protocol sends a message from one participant (called a role) to another participant. The participants are declared as arguments of the protocol. The subscript notation on the roles indexes the participants and is explained in detail in the next section. The message has a label, which may be omitted from the interaction statement. The message also carries a type name as a parameter to the label, called a payload type, which represents the data type of the message being sent.

Choice statements consist of a deciding role and a set of branches, where each branch is an alternative interaction sub-pattern which the participants can collectively select. The deciding role sends a label to the other roles involved in the choice to notify them of the branch taken.

Iterations (loops) in the interaction patterns are written as labelled recursion blocks, with a continue statement naming the label to jump back to the beginning of the recursion.

2.2 Syntax of Pabble

2.2.1 Global protocols

Figure 2 lists the core syntax of Pabble, which consists of two protocol declarations, global and local. A global protocol is declared with the protocol name (\( str \) denotes a string) and with role and group parameters, followed by the body \(G\). Role \(R\) is a name with argument expressions. The argument expressions are ranges or arithmetic expressions \(h\), and the number of arguments corresponds to the dimension of the array of roles: for example, a role with two range arguments of sizes 4 and 2 denotes a 2D array, forming a 4-by-2 array of roles.
Fig. 2 Pabble syntax

Declared roles can be grouped by specifying a named group using a dedicated keyword, followed by the group name and the set of roles. For example, a group can be declared to consist of two individual workers. A special built-in group is defined as all processes in a session. We can encode collective operators such as many-to-many and many-to-one communication with this built-in group, which will be explained later.

Apart from specifying ranges explicitly, ranges can also be specified using expressions. Expression \(e\) consists of the usual arithmetic operators, logarithm, left and right logical shifts \((\mathtt{<}\mathtt{<}, \mathtt{>}\mathtt{>})\), numbers, variables (\(i, j, k\)), and constants \((\mathtt{M}, \mathtt{N})\). Constants are either bound outside the protocol declaration or are left free (unbound) to represent an arbitrary number. As in [16], when the constants are bound, they are declared outside the protocol either by a single number or by lower and upper bounds. We also allow leaving the declaration free (unbound), as a shorthand for an arbitrary constant whose upper bound is a special value representing the maximum possible value, i.e., practically unbounded. A binding range expression \(b\) takes the form \(i:e_1..e_n\), which means that \(i\) ranges from \(e_1\) to \(e_n\). Binding variables always bind to a range expression and not to individual values. We shall explain the use of binding range expressions in more detail later.

In a global protocol \(G\), \(l(T)\) from \(R_1\) to \(R_2\) is called an interaction statement, which represents passing a message with label \(l\) and type \(T\) from one role \(R_1\) to another role \(R_2\). \(R_1\) is the sender role and \(R_2\) is the receiver role. A choice statement at role \(R\) with branches \(\{G_1\}\) or ... or \(\{G_{n}\}\) means that the role \(R\) will select one of the global types \(G_1,\ldots ,G_n\). A recursion block with body \(\{G\}\) declares a label \(l\), which a continue statement inside the block names in order to jump back. A foreach statement denotes a for-loop whose iteration is specified by a binding range expression \(b\). For example, a foreach statement with \(b = i:1..n\) represents the iteration of \(G\) from \(1\) to \(n\), where \(G\) is parameterised by \(i\).

Finally, the all-reduce statement with \(op_c(T)\) means that all processes perform a distributed reduction of a value with type \(T\) using the operator \(op_c\) (like MPI_Allreduce in MPI). It takes a mandatory predefined operator \(op_c\), which must be a commutative and associative arithmetic operation. Pabble currently supports sum and product.

We allow using simple expressions over constants (e.g., \(\mathtt{N}-1\) as the end of a range) to parameterise ranges. In addition, indices can also be calculated by expressions on bound variables (e.g., \(i+1\)) to refer to relative positions of roles.

These restrictions on indices such as bound variables and relative indices calculations ensure termination of the projection algorithm and type checking. The binding conditions are discussed in the next subsection.

2.2.2 Local protocols

Local protocol \(L\) has the same syntax as the global protocol, except that interaction statements are replaced by input from \(R\) (receive) and output to \(R\) (send). The main declaration of a local protocol states that the protocol is located at role \(R_e\). We call \(R_e\) the endpoint role.

In Pabble, multiple local protocol instances can reside in the same parameterised local protocol. This is because each local protocol is a local specification for a participant of the interaction. Where there are multiple participants with a similar interaction structure that fulfill the same role in the protocol, such as the workers in the Ring example from the introduction, the participants are grouped together as a single parameterised role.

The local protocol for a collection of participants can be specified in a single parameterised local protocol, using conditional statements on the role indices to capture edge cases. For example, in the general case of a pipeline interaction, all participants receive from one neighbor and send to another neighbor, except the first participant, which initiates the pipeline and is only a sender, and the last participant, which ends the pipeline and does not send. In these cases, we use conditional statements to guard the input or output statements. To express conditional statements in local protocols, a condition on a role \(R\) may be prepended to an input or output statement. The guarded input/output statement is ignored if the local role does not match \(R\). More complicated matches can be performed with a parameterised role, where the role parameter range of the condition is matched against the parameter of the local role: a condition over a range of indices matches a participant whose index lies in that range, but not one whose index lies outside it. It is also possible to bind a variable to the range in the condition, and the bound variable can then be used in the same statement.
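The pipeline case described above can be sketched in a few lines of Python. This is a minimal illustration, assuming participants indexed 1..n; the function name local_actions and the ("recv"/"send", peer) tuples are hypothetical and only mimic the effect of guarded input and output statements in a parameterised local protocol.

```python
def local_actions(index, n):
    """Local actions of participant `index` in a pipeline over 1..n.

    Each guard plays the part of a condition prepended to an input or
    output statement: the statement is ignored when the guard does not
    match the local participant. Hypothetical helper, not Pabble syntax.
    """
    actions = []
    # Guarded input: applies to participants 2..n (all but the first).
    if 2 <= index <= n:
        actions.append(("recv", index - 1))
    # Guarded output: applies to participants 1..n-1 (all but the last).
    if 1 <= index <= n - 1:
        actions.append(("send", index + 1))
    return actions


if __name__ == "__main__":
    for i in range(1, 5):
        print(i, local_actions(i, 4))
```

A single definition therefore covers the first participant (send only), the interior participants (receive, then send) and the last participant (receive only).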

2.3 Well-formedness conditions: index binding

As Pabble protocols include expressions in parameters, a valid Pabble protocol is subject to a few well-formedness conditions. Below, we show the conditions which ensure indices used in roles are correctly bounded. We use \(\mathsf {fv}/\mathsf {bv}\) to denote the set of free/bound variables defined as \(\mathsf {fv}(i)=\{i\}\), \(\mathsf {fv}(\mathtt{N})=\mathsf {fv}( num )=\emptyset \), \(\mathsf {fv}(i:e_1..e_n)=\cup \mathsf {fv}(e_j)\) and \(\mathsf {bv}(i:e_1..e_n)=\{i\}\). Others are inductively defined.
  1. In a global protocol role declaration (the parameters of the protocol), indices outside of the declared range are invalid: for example, if a role is declared with the index range 1..N, the index N+1 is invalid.

  2. Let \(b_1, \ldots , b_k\) be the binding range expressions of the for-loops that enclose a statement in \(G\).

    (a) Suppose an interaction statement \(l(T)\) from \(R_1\) to \(R_2\) appears in \(G\). Let \(R_1 = Role _1[h_1]\dots [h_n]\) and \(R_2 = Role _2[e'_1]\dots [e'_m]\) (we assume \(n=0\) (resp. \(m=0\)) if \(R_1\) (resp. \(R_2\)) is either a single participant or group).

      (1) \(n=m\) (i.e., the dimensions of the parameters are the same);

      (2) \(\mathsf {fv}(h_j)\subseteq \cup \mathsf {bv}(b_i)\) (i.e., the free variables in the sender roles are bound by the for-loops);

      (3) \(\mathsf {fv}(e'_j)\subseteq (\cup \mathsf {bv}(b_i))\cup \mathsf {bv}(h_j)\) (i.e., the free variables in the receiver roles are bound by either the for-loops or sender roles).

    (b) Suppose a choice statement at role \(R\) appears in \(G\). Then, \(R\) is a single participant, i.e., either \( Role \) or \( Role [e]\) with \(\mathsf {fv}(e)\subseteq (\cup \mathsf {bv}(b_i))\).

Condition 2(a)(1) ensures the number of sender parameters matches the number of receiver parameters. For example, the following is invalid:

Condition 2(a)(2) ensures variables used by a sender are declared by the enclosing for-loops.

Condition 2(a)(3) makes sure the receiver parameter at the j-th position is bound by the for-loops or the sender parameter at the j-th position (and not binders at other positions). For example, the following is valid:
But with the index swapped, it becomes invalid:

Condition 2(b) is similar for the case of choice statements, where \(R\) should be a single participant to satisfy the unique sender condition in [6, 8].
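Conditions 2(a)(1) to 2(a)(3) can be read as a simple set-inclusion check. The Python sketch below is one possible encoding, not the Pabble implementation: index expressions are written as Python-style strings, a sender dimension is a hypothetical pair (binder, bounds), and the constants M and N are assumed to be the only declared constants.

```python
import ast


def free_vars(expr, constants=frozenset({"M", "N"})):
    """Variables occurring in an index expression; constants have no free variables."""
    tree = ast.parse(expr, mode="eval")
    names = {node.id for node in ast.walk(tree) if isinstance(node, ast.Name)}
    return names - set(constants)


def check_interaction(loop_binders, sender_dims, receiver_dims):
    """Check conditions 2(a)(1)-(3) for one interaction statement.

    loop_binders:  variables bound by the enclosing for-loops.
    sender_dims:   one (binder or None, [bound expressions]) pair per dimension.
    receiver_dims: one index expression per dimension.
    """
    # (1) Sender and receiver must have the same number of dimensions.
    if len(sender_dims) != len(receiver_dims):
        return False
    for (binder, bounds), recv_expr in zip(sender_dims, receiver_dims):
        # (2) Free variables of the sender must be bound by the for-loops.
        if any(not free_vars(b) <= loop_binders for b in bounds):
            return False
        # (3) Free variables of the receiver at position j must be bound by
        #     the for-loops or by the sender binder at the same position j.
        allowed = set(loop_binders)
        if binder is not None:
            allowed.add(binder)
        if not free_vars(recv_expr) <= allowed:
            return False
    return True
```

With this encoding, a two-dimensional sender binding i and j accepts the receiver indices ["i+1", "j"] but rejects the swapped ["j", "i+1"], because each receiver position may only use the binder at the same position.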

2.4 Well-formedness conditions: constants

In Pabble protocols, constants can be defined by

  (1) A single numeric value; or

  (2) Lower and upper bound constraints that do not involve the special maximum value.

Lower and upper bound constraints are designed for runtime constants, e.g., the number of processes spawned in a scalable protocol, which is unknown at design time and will be defined and immutable once the execution begins. To ensure that Pabble protocols are communication-safe for all possible values of constants, we must ensure that all parameterised role indices stay within their declared range. Such conditions prevent sending to or receiving from an invalid (non-existent) role, which would lead to a communication mismatch at runtime.

In case (1), the check is trivial. In case (2), we require a general algorithm to check the validity of the multiple constraints that appear in the ranges. First, we formulate the constraints on the values of the constants as a series of linear inequalities. We then combine the linear inequalities and determine the feasible region using standard linear programming. The feasible region represents the pool of possible values under any combination of the constraints. The following explains how to determine whether the protocol will be valid for all combinations of constants:

The basic constraints from the constants are the lower and upper bounds given in their declarations. We then calculate the range of the role indices used in each interaction statement. Since the objective is to ensure that the role parameters in the protocol body (i.e., those of the sender and of the receiver) stay within the bounds of the declared role, we define a constraint set consisting of the lower and upper bound inequalities of the two ranges. From them, we obtain a single inequality over the constants. By comparing this inequality against the basic constraints on the constants, we can check whether all outcomes belong to the feasible region; if some do not, the protocol is not communication-safe. On the other hand, if the constraints are unconditionally true, we can guarantee that no combination of the constants will cause communication errors.
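For small bounds, the same question can be answered by exhaustive enumeration rather than linear programming. The sketch below uses that simpler substitute technique; the constant names M and N, their bounds and the example ranges are assumptions chosen for illustration and are not taken from the protocol discussed above.

```python
from itertools import product


def unsafe_assignments(const_bounds, declared, used_ranges):
    """List constant assignments under which some used index range leaves
    the declared role range.

    const_bounds: {name: (lower, upper)} for each bounded constant.
    declared:     function env -> (lower, upper) of the declared role range.
    used_ranges:  functions env -> (lower, upper) of indices used in the body.
    Brute-force enumeration, used here in place of linear programming.
    """
    names = sorted(const_bounds)
    domains = [range(lo, hi + 1) for lo, hi in (const_bounds[n] for n in names)]
    bad = []
    for values in product(*domains):
        env = dict(zip(names, values))
        d_lo, d_hi = declared(env)
        for used in used_ranges:
            u_lo, u_hi = used(env)
            if u_lo < d_lo or u_hi > d_hi:
                bad.append(env)
                break
    return bad


if __name__ == "__main__":
    bounds = {"M": (1, 3), "N": (2, 5)}           # hypothetical constants
    declared = lambda env: (1, env["N"])          # role declared over 1..N
    senders = lambda env: (1, env["M"])           # senders i in 1..M
    receivers = lambda env: (2, env["M"] + 1)     # receivers i+1
    print(unsafe_assignments(bounds, declared, [senders, receivers]))
```

An empty result means that every combination of constants keeps the indices inside the declared range; a non-empty result lists counterexamples such as a sender range that exceeds the number of declared roles.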

Arbitrary constants In addition to constant values and lower and upper bound constants, we also consider the use cases when the value of a constant can be any arbitrary value in the set of natural numbers. This is an extension of case (2) in which the upper bound of the constant is the special maximum value, representing a range without upper bound.

In order to check that role indices are valid with unbounded ranges, we enforce two simple restrictions. First, only one constant can be defined with the special maximum value in one global protocol. Secondly, when the index is unbounded, its range calculation only uses addition or subtraction on integers.

A protocol with an invalid use of arbitrary constants is shown below:

Consider instantiating the arbitrary constant to a particular value, which fixes the declared range of the role. In the first interaction statement, a role index is then invalid because it is not in the declared range of the role. In the second statement, a role index is also invalid, as it evaluates to a value that is out of the declared range.

On the other hand, the following protocol is valid since the indices always stay between 0 and the upper bound of the declared range.

As shown in [21], most representative topologies with an arbitrary number of participants can be represented under these conditions.

2.5 Endpoint projection

In the next step, a Pabble protocol should be projected to a local protocol, which is a simplified Pabble protocol as viewed from the perspective of a given end point. The projection algorithm is explained below. To begin with, the header of the global protocol is projected to the header of a local protocol, where the protocol name name and parameters param are preserved and the endpoint role \(R_e\) is declared.

Table 1 shows the projection of the body of global protocol \(G\) onto \({\mathbf {R}}\) at endpoint role \(R_e\). The projection rules are applied from top to bottom in the table. If a global protocol matches multiple rules, then there will be more than one line of projected protocol for a single global protocol. In Rules 1–4, we show the rule for a single argument, as the same rule is applied to \(n\) arguments. Each rule is applied if \({\mathbf {R}}\) meets the condition in the second column under the constraints given by the constant declarations. Rules 1 and 2 show the projection of the interaction statement when \({\mathbf {R}}\) appears in the receiver and the sender position, respectively. Since \({\mathbf {R}}\) is a single participant, it should satisfy \({\mathbf {R}}=R_e\) (i.e., the role is the endpoint role). The projection simply removes the reference to role \({\mathbf {R}}\) from the original interaction statement.
Table 1 Projection of \(G\) onto \({\mathbf {R}}\) at the end point role \(R_e\) (\(L\) and \(L_i\) correspond to the projection of \(G\) and \(G_i\) onto \({\mathbf {R}}\))

Rules 3 and 4 show the projection of an interaction statement if role \({\mathbf {R}}\) is a parameterised single participant where \({\mathbf {R}}\) is an element of the endpoint role \(R_e\). For example, if \(R_e\) is a parameterised role declared with three indices, \({\mathbf {R}}\) can be any one of the three indexed participants. In addition to removing the reference to role \({\mathbf {R}}\) in the receive and send statements, we also prepend the conditions under which the role applies. The order in which the projection rules are applied ensures that an interaction statement is localized to a receive followed by a send. In general, both receive–send and send–receive orders in the projected local protocol are correct, as long as the projection algorithm is consistent and the well-formedness conditions of the global protocol are satisfied. The global protocol ensures, by session typing, that a send has a matching receive at the same stage of the protocol.

Rule 5 is for all-to-all communication. Any role \({\mathbf {R}}\) will send a message with type \(U\) to all other participants and will receive some value with type \(U\) from all other participants. Since all participants start by first sending a message to all, no participant will block waiting to receive in the first phase, so no deadlock occurs.

Rules 6 and 7 are the projection rules for the case that we project onto a group. We need to check that a group is a subset of the endpoint role \(R_e\) with respect to the group declarations in the global protocol. Then, the rules can be understood as Rules 3 and 4.

Rules 8 and 9 show the projection of interaction statements with parameterised roles using relative indexing (we show only one argument; the algorithm extends easily to multiple arguments using the same methods). Rule 8 uses two auxiliary transformations of expressions, one that applies an expression to a range and one that inverts an expression. Table 2 lists examples of both. The first takes two arguments, a range with a binding variable (\(b\)) and an expression using the binding variable (\(e\)). The expression is applied to both ends of the range to transform the relative expression into a well-defined range. The second calculates the inverse of a given expression; for example, the inverse of \(i+1\) is \(i-1\). In cases when an inverse expression cannot be derived, the expression is calculated by expanding to all values in the range and instantiating every value bound by its binding variable.

Concretely, a statement whose index expression has no inverse is expanded into one statement per value of the binding variable before the projection rules are applied. In order to perform this range expansion, the beginning and the end of the range must be known at projection time. For this reason, the projection algorithm returns failure if a statement uses parameterised roles with such expressions and the range of the expressions is defined with arbitrary constants (see Sect. 2.4). Otherwise, the expansion might continue infinitely and not terminate. This is the only situation in which projection may fail, given a well-formed global protocol. The condition \({\mathbf {R}}[b] \subseteq R_e\) of Rule 9 means that the range of \(b\) is within the range of the endpoint role \(R_e\), i.e., every index in the range of \(b\) is also a declared index of \(R_e\).
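The two transformations and the fallback expansion can be sketched for the simplest family of relative indices, a fixed integer offset. The helper names receiver_range, sender_of and expand are hypothetical, and the modulo expression used in the usage example is an assumed stand-in for an expression without a usable inverse.

```python
def receiver_range(lo, hi, offset):
    """Apply the expression i + offset to both ends of the binding range lo..hi."""
    return lo + offset, hi + offset


def sender_of(receiver_index, offset):
    """Inverse of i + offset: recover the sender index from a receiver index."""
    return receiver_index - offset


def expand(lo, hi, expr):
    """Fallback when no inverse is available: instantiate every value of the
    binding variable in lo..hi. Both ends of the range must be known."""
    return [(i, expr(i)) for i in range(lo, hi + 1)]


if __name__ == "__main__":
    # Senders 1..3 each send to index i+1, so receivers are 2..4.
    print(receiver_range(1, 3, 1))
    # The receiver at index 3 receives from the sender at index 2.
    print(sender_of(3, 1))
    # An expression with no inverse is expanded value by value.
    print(expand(0, 3, lambda i: i % 2))
```

The expansion only terminates because both ends of the range are concrete numbers, which mirrors the requirement that the range must not be defined with arbitrary constants.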

If a projection role matches the choice role (the role \({\mathbf {R}}\) at which the choice is made) (Rule 10), then the statement is a selection statement, whose action is selecting a branch by sending a label. The child or blocks (\(L_1\)...\(L_N\)) are recursively projected. If a projection role does not match the choice role (Rule 11), then the choice statement represents a branch statement, which is the dual of the selection. Recursion (Rule 12), continue (Rule 13) and foreach (Rule 14) statements are simply kept in the projected endpoint protocol.

2.6 Collective operations

In addition to point-to-point message-passing, collective operations can also be concisely represented by Pabble. End point message-passing statements are interpreted differently depending on the declarations (i.e., parameters) in the global type. Figures 3, 4, 5 and 6 list the four basic messaging patterns and the interpretations of their projections: point-to-point, scatter (distribution), gather (collection) and all-to-all (symmetric distribution and collection). As shown in the figures, the combination of projected local statements and the type (i.e., single participant or group role) of the local role being projected are unique and can identify the communication pattern in the global protocol.
Fig. 3 Point-to-point communication and Pabble representation

Fig. 4 Scatter pattern and Pabble representation

Fig. 5 Gather pattern and Pabble representation

Fig. 6 All-to-all pattern and Pabble representation

2.7 Correctness and termination of the projection

The parameterised session theory on which Pabble is based [9] has shown that, in the general case, projection and type checking are undecidable. Our first challenge for Pabble's design is to ensure the termination of well-formedness checking and projection, without sacrificing expressiveness. The theorems and proofs can be found in this section.

Theorem 1

(Termination) Given a global protocol \(G\), the well-formedness checking terminates; and given a well-formed global type \(G\) and an end point role \(R_e\), the projection of \(G\) onto \(R_e\) always terminates.

Proof

By the definition of the well-formedness conditions in Sect. 2.3, if a free variable appears in the range position, it is bound by either for-loops or the sender role in the interaction statement. In the case of the for-loop, we can apply the same reduction rules of the for-loop of the global types from Sect. 2 and apply the equality rules in [9, Fig. 15]. Hence, one can check that, given \(R_e\) and \({\mathbf {R}}\), all of the conditions (in the second column) in Table 1 are decidable. For the projection, the only non-trivial projection rule is Rule 8. The termination of this rule is ensured by the termination of the two auxiliary transformations of expressions that it uses. If the inverse of \(e\) is not defined, we first check that \(e\) has a finite range and then use Rules 3 and 4 by expanding the interaction statements to all values in the range (as explained in Sect. 2.5). Hence, the projection algorithm always terminates. \(\square \)

Note that the above theorem implies the termination of type checking (see Theorem 4.4 in [9]).
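
The finite expansion used in the proof can be illustrated with a small sketch. It assumes a parameterised interaction of the shape "from R[i:lo..hi] to R[i+offset]" (our own simplified notation, not necessarily the exact Pabble syntax) and enumerates the concrete sender and receiver indices. Because the range is finite, the loop runs exactly `hi - lo + 1` times, which is the intuition behind termination in this case. The function name `expand` is hypothetical.

```python
def expand(lo: int, hi: int, offset: int) -> list:
    """Expand a ranged interaction into concrete (sender, receiver) index pairs.

    The binding range lo..hi is inclusive, mirroring the range notation
    used in the text; an empty range (lo > hi) yields no pairs.
    """
    return [(i, i + offset) for i in range(lo, hi + 1)]


# With N = 4, the range 1..N-1 and relative expression i+1 expand to three
# point-to-point interactions.
pairs = expand(1, 3, 1)
```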

One of the benefits of using Pabble is that it provides the expressiveness required to represent collective interactions in MPI. The correctness of projections of these protocols is ensured by the projection rule of the groups in [7]. The special case of \(U\) […] follows the asynchronous subtyping rules in [18]. The correctness property relating to ranges in Pabble is as follows:

Theorem 2

(Range) The indices of roles appearing in a local protocol body do not exceed the lower and upper bounds stated in the global protocol ProtocolName(para) in […].

Proof

If the range relies on case (2), the correctness is ensured by linear programming. Other cases are straightforward since each condition in Table 1 checks whether roles conform to the bounds in the global protocol. \(\square \)
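
To make the range property concrete, the following sketch checks that an affine index expression `scale * i + offset`, with `i` drawn from a binding range, stays within the declared bounds of a role. This is not the linear programming check mentioned in the proof; it relies on the simpler fact that an affine function over an integer interval attains its extremes at the endpoints. All names are hypothetical.

```python
def affine_bounds(lo: int, hi: int, scale: int, offset: int) -> tuple:
    """Smallest and largest value of scale*i + offset for i in lo..hi."""
    ends = (scale * lo + offset, scale * hi + offset)
    return min(ends), max(ends)


def within_declared(lo, hi, scale, offset, decl_lo, decl_hi) -> bool:
    """True if every index scale*i + offset (i in lo..hi) lies in decl_lo..decl_hi."""
    if lo > hi:
        return True  # empty binding range: nothing to check
    low, high = affine_bounds(lo, hi, scale, offset)
    return decl_lo <= low and high <= decl_hi


# With roles declared over 1..5: i+1 for i in 1..4 is in bounds,
# but i+1 for i in 1..5 would reach index 6.
ok = within_declared(1, 4, 1, 1, 1, 5)
bad = within_declared(1, 5, 1, 1, 1, 5)
```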

3 Pabble examples

In Sect. 2.5, we described how to obtain a local Pabble protocol by projection from a Pabble protocol. The local protocol can then be used as a blueprint to implement parallel programs. In this section, we run through two examples of local protocol projection, using a Ring protocol in Sect. 3.1 and a MapReduce protocol in Sect. 3.2, showing projection of protocols involving point-to-point and multicast collective applications, respectively.

Then, we present Pabble use cases in web services in Sect. 3.3 and remote procedure call (RPC) composition in Sect. 3.4, showing the capabilities of Pabble as a general-purpose parameterised protocol description language.

Finally, we show an implementation of a parallel linear equation solver in Sect. 3.5, written in MPI and following a wraparound mesh protocol designed in Pabble, demonstrating how Pabble can be used in practical programming. Additional Pabble examples, drawn from the Dwarfs [1] evaluation metric, are available from our web page [21].

3.1 Projection example: Ring protocol

We now run through the projection of the Ring protocol from Sect. 1 as an example. Local protocols are generated from the global protocols. From the perspective of a projection tool, to write a protocol for an endpoint, we start with […] followed by the name of the protocol and the endpoint role it is projected for. Since the only role of the Ring protocol is […], which is a parameterised role, we use the full definition of the parameterised role, […]. Then, we list the roles used in the protocol inside a pair of parentheses, similar to function arguments in a function definition in C. Note that if the projection role is in the list, we exclude it because the local protocol itself is written from the perspective of that role; however, since parameterised roles can be used on multiple endpoint roles, we allow parameterised roles to appear in the list of roles in the protocol. The first line of the projected protocol is thus given as follows:
We then copy the recursion statement to the local protocol, which will be present in all projected protocols.
Next, we take the first interaction statement from the Ring protocol and project it with respect to […], applying the rules listed in Table 1. As the first statement involves a parameterised destination role, we apply Rule 7 to extract the receive portion of the interaction statement. The […] function is applied to […] and the relative expression […] to obtain […] for the role condition. The […] of the relative expression […] is […], which will form the index of the sender role.
Since […] also matches the source parameterised role, Rule 8 is applied to get the send portion of the interaction statement.
Then, we move on to the second statement of the global protocol, […]. As with the previous statement, we apply Rule 3 and Rule 4 to obtain the respective receive and send statements in the local protocol.
Finally, we apply Rule 13 to trivially copy the […] statement to the local protocol.
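
The projection steps above can be approximated by a small sketch for one concrete worker. It assumes, as an illustration only, that the ring consists of two statements: a message from Worker[i:1..N-1] to Worker[i+1], followed by a message from Worker[N] back to Worker[1]. The helper `shift_range` plays the part of applying a relative expression to a range, and the `me - 1` computation plays the part of inverting the relative expression; both names and the tuple encoding of actions are hypothetical.

```python
def shift_range(lo: int, hi: int, offset: int) -> tuple:
    """Range of receiver indices when senders lo..hi each send to index + offset."""
    return lo + offset, hi + offset


def project_ring(me: int, n: int) -> list:
    """Local actions of Worker[me] for one iteration of an n-worker ring."""
    actions = []
    # Statement 1 (assumed): from Worker[i:1..n-1] to Worker[i+1].
    recv_lo, recv_hi = shift_range(1, n - 1, 1)   # receivers are 2..n
    if recv_lo <= me <= recv_hi:
        actions.append(("recv", me - 1))          # inverse of i+1 is i-1
    if 1 <= me <= n - 1:
        actions.append(("send", me + 1))
    # Statement 2 (assumed): from Worker[n] to Worker[1].
    if me == 1:
        actions.append(("recv", n))
    if me == n:
        actions.append(("send", 1))
    return actions


first = project_ring(1, 4)
middle = project_ring(2, 4)
last = project_ring(4, 4)
```

Each worker evaluates the same role conditions and keeps only the actions whose condition holds for its own index, which mirrors the conditional statements of a projected local protocol.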

The resulting local protocol is the following, as shown in

3.2 Projection example: MapReduce protocol

The following example shows another parameterised protocol, which represents the map–reduce pattern of work distribution and reduction. This example uses a common parallel programming idiom, collective operations. In contrast to the previous example, there is more than one declared role in the protocol, and one of the roles is an ordinary non-parameterised role.
In this protocol, the statements involve two roles, one of which is an ordinary role […] (in the sense that it is non-parameterised), and the other is a parameterised role […]. The […] parameterised role represents a group of related roles, but does not expand to multiple explicit message-passing statements. We further declare a group role […] which includes all the […] roles as members. The statement in Line 2 is a scatter operation by which the […] distributes a message of type […] to each of the named endpoints in the […] group, […] to […]. The statement in Line 3 is a gather operation, the reverse of the scatter, in which the […] role collects messages of type […] from the members of the […] group. Figure 7 depicts the interactions in the protocol.
Fig. 7 Topology of the MapReduce protocol

Listing 2 shows the local protocol of MapReduce at the […] role. Since […] is a non-parameterised participant, Rules 2 and 1 are applied to get Lines 2 and 3, respectively. This results in a protocol body without conditional interactions.
The local protocol of […] for […] is similarly derived by applying the projection rules. Since […] is a group role and a subset of […], Rules 6 and 7 are applied to get Lines 2 and 3.
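
A minimal sketch of this kind of projection is given below. It assumes a two-statement protocol with a scatter followed by a gather; the role names `Master`, `Worker[k]` and `Workers` and the message labels `Map` and `Reduce` are placeholders chosen for illustration, and the textual form of the output only approximates local protocol statements.

```python
def project(statements, role, groups):
    """Project (label, sender, receiver) statements onto `role`.

    `groups` maps a group name to the set of its member role names, so a
    role is involved in a statement if it is named directly or is a member
    of the named group.
    """
    def involves(party):
        return party == role or role in groups.get(party, set())

    local = []
    for label, sender, receiver in statements:
        if involves(receiver):
            local.append(f"{label}() from {sender};")
        if involves(sender):
            local.append(f"{label}() to {receiver};")
    return local


# Hypothetical global protocol: scatter to the group, then gather from it.
protocol = [("Map", "Master", "Workers"), ("Reduce", "Workers", "Master")]
groups = {"Workers": {"Worker[1]", "Worker[2]", "Worker[3]"}}

master_view = project(protocol, "Master", groups)
worker_view = project(protocol, "Worker[2]", groups)
```

The non-parameterised role sees a send to the group followed by a receive from it, while each group member sees the mirror image, without any conditional interactions.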

3.3 Use case: web services

Pabble is inspired by applications in the domain of parallel programming, but the parametric nature of Pabble as a protocol language allows us to express interactions with more flexibility while keeping the protocols succinct.

The Quote Request protocol specification (C-UC-002) is the most complex use case in [32], published by the W3C Web Services Choreography Working Group [31].
Fig. 8 Web Services Quote Request Interaction

It describes a buyer interacting with multiple suppliers, who in turn interact with multiple manufacturers, in order to obtain a quote for some goods or services.

The basic steps of the interaction are as follows:
  1. A buyer requests a quote from a set of suppliers.
  2. All suppliers forward the quote request for the items to their manufacturers.
  3. The suppliers interact with their manufacturers to build the quotes for the buyer, which are then sent back to the buyer.
  4. (a) Either the buyer agrees with the quotes and places the orders,
     (b) or the buyer modifies the quote and sends it back to the suppliers.
  5. In the case that the supplier received an updated quote request (4b):
     (a) Either the supplier responds to the updated quote request by agreeing to it and sending a confirmation message back to the buyer,
     (b) or the supplier responds to the updated quote request by modifying it and sending it back to the buyer, and the buyer goes back to step 4,
     (c) or the supplier responds to the updated quote request by rejecting it,
     (d) or the supplier renegotiates with the manufacturers, in which case we return to step 3.
Figure 8 shows the interactions between different components in the Quote Request use case. We use the generic numbers S for suppliers and M for manufacturers. The interactions are described as a Pabble global protocol in Listing 4. In the protocol, we omit the implicit requestIdType, which keeps track of the state of each role over the stateless web transport, from the payload type of all messages.

The Buyer initiates the quote request on Line 2, when it broadcasts a Quote() message to all Suppliers. Then, on Lines 4–7, each of the Supps forwards the quote requests to its respective Manufacturers and gets a reply from each of them by a series of gather and scatter interactions. Next, the Suppliers reply to the Buyer on Line 9, and the Buyer then decides between accepting the offer straight away (Line 14, outcome 4a) and sending a modified quote request (Line 17, outcome 4b). If a Supp receives a modified quote, it decides between accepting the modified quote (Line 21, outcome 5a), rejecting the modified quote straight away (Line 29, outcome 5c) and modifying the quote and renegotiating with the Buyer (Line 24, outcome 5b). It is also possible that the Supplier renegotiates with its Manufacturers again, so it notifies the Buyer and returns to the initial negotiation phase (Line 32, outcome 5d). The projected endpoint protocol for Buyer is shown in Listing 5.
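
The branching structure of steps 1 to 5 can be sketched as a small transition table. The state and event names below are hypothetical and flatten the multi-party protocol into a single negotiation thread between the buyer and one supplier; the sketch only illustrates which outcomes loop back and which terminate.

```python
# Each state maps an event to the next state; comments give the step numbers.
TRANSITIONS = {
    "request": {"quote": "build"},                     # steps 1 and 2
    "build": {"reply": "buyer_decides"},               # step 3
    "buyer_decides": {
        "accept": "done",                              # outcome 4a
        "modify": "supplier_decides",                  # outcome 4b
    },
    "supplier_decides": {
        "confirm": "done",                             # outcome 5a
        "modify": "buyer_decides",                     # outcome 5b, back to step 4
        "reject": "done",                              # outcome 5c
        "renegotiate": "build",                        # outcome 5d, back to step 3
    },
}


def run(events, state="request"):
    """Follow a sequence of events; a KeyError signals a protocol violation."""
    for event in events:
        state = TRANSITIONS[state][event]
    return state


simple = run(["quote", "reply", "accept"])
looping = run(["quote", "reply", "modify", "renegotiate",
               "reply", "modify", "modify", "accept"])
```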

3.4 Use case: RPC composition

We present a use case from the Ocean Observatories Initiative Project [28]. The use case describes a high-level Remote Procedure Call (RPC) request/response protocol between layers of proxy services. An application sends a request to a high-level service, and the service is expected to reply to the application with a result. If the service does not provide the requested service, then this high-level service will issue a request to a lower-level service which can process the request. This request/response protocol is chained between services in each level until a low-level service is reached.

Figure 9 describes the chaining of RPC-style request/response protocol. A request is routed to the most relevant service provider through multiple proxy services hidden from higher level services. The request routes through a multi-hop path from the requester to the resources. The reply is routed in reverse through the same participant proxy services back to the requester.
Fig. 9 RPC request/response chaining

We represent this series of interactions using the Pabble protocol outlined below. Each participant, […], represents a proxy service at one of the levels. […] is the requester and […] is the actual service provider. A […] message is sent from a […] to the […] in the level directly below, until it reaches […], which processes the request and replies to the higher-level service with a […]. Using a […] loop with decrementing indices, the […] is cascaded back to the originating service, […]. The Pabble protocol is shown in Listing 6.
Fig. 10 \(N^2\)-node wraparound mesh topology

As the request and response phases are symmetric and involve the same participants, we are able to compact the multi-layer protocol into just two […] loops, each with one parameterised interaction statement. […] can be an arbitrary constant to allow maximum flexibility in the protocol. This simple and concise representation of a complex RPC chaining protocol is possible because of the index notation in Pabble.
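
The two-loop structure can be illustrated with a short sketch that lists the messages of one chained call. It assumes the service levels are indexed 1 to n, with level 1 as the requester and level n as the provider, and uses the placeholder labels `Request` and `Response`; none of these names are taken from the original listing.

```python
def rpc_chain(n: int) -> list:
    """Messages (label, sender level, receiver level) for one chained RPC."""
    trace = []
    # Request phase: each level forwards to the level directly below it.
    for i in range(1, n):
        trace.append(("Request", i, i + 1))
    # Response phase: indices decrement so the reply retraces the same path.
    for i in range(n, 1, -1):
        trace.append(("Response", i, i - 1))
    return trace


trace = rpc_chain(3)
```

The number of levels is a parameter of the sketch, which reflects how one parameterised statement per loop covers a chain of any length.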

3.5 Implementation example: Linear equation solver

Listing 9 shows an example implementation outline for a linear equation solver using a wraparound mesh, which follows the Pabble protocol in Listing 7. The topology is illustrated in Fig. 10. The example is given in the Message Passing Interface (MPI), the standardized API for developing message-passing applications in parallel computing.
The protocol above describes a wraparound mesh that performs a ring propagation between […] (for worker) in the same row (Lines 3–4), and the result of each […] row is distributed to all […]s in the first column (i.e., […]) using a group-to-group distribution on Line 7. The global protocol is then automatically projected into its local protocol, shown in Listing 8 below. Developers can then implement the application using its local protocol as a guide.
Note the similarity between the local protocol and the structure of the MPI implementation in Listing 9. In particular, the conditional sends and receives in MPI correspond directly to the role conditions in the local protocol, which was derived from the global protocol by projection.
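
The row-wise ring on a wraparound mesh can be sketched with plain index arithmetic. The code below assumes 0-based coordinates and a row-major mapping from (row, column) to a process rank, which is our own choice for illustration and not necessarily the layout used in the listings.

```python
def rank_of(row: int, col: int, n: int) -> int:
    """Row-major rank of the process at (row, col) in an n-by-n mesh."""
    return row * n + col


def right_neighbour(rank: int, n: int) -> int:
    """Next process in the same row, wrapping from the last column to the first."""
    row, col = divmod(rank, n)
    return rank_of(row, (col + 1) % n, n)


def left_neighbour(rank: int, n: int) -> int:
    """Previous process in the same row, wrapping from the first column to the last."""
    row, col = divmod(rank, n)
    return rank_of(row, (col - 1) % n, n)


# In a 3-by-3 mesh, rank 2 is the last process of row 0 and wraps to rank 0.
wrapped = right_neighbour(2, 3)
```

In an SPMD program each process would evaluate such expressions on its own rank to find its communication partners, which is why role conditions in the local protocol line up with conditionals in the implementation.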

4 Type checking

Given the local protocol and the implementation, we propose a session type checker to verify the conformance of the implementation against the projected local protocol. Conformance of end point programs against the projected protocol will yield communication-safe parallel programs.

Pabble local protocols have a similar structure to MPI programs. Both Pabble protocols and MPI programs are designed so that a single source code represents multiple endpoints, a result of the single-program multiple-data (SPMD) parallel programming model. The core communication primitives of MPI correspond to Pabble statements, as demonstrated in Listing 9. In addition, collective operations such as broadcast (MPI_Bcast) or all-reduce (MPI_Allreduce) can be supported by the collective operation correspondence in Sect. 2.5.

Challenges for a complete MPI type checker. In [20], Ng et al. introduced a session type checker for a non-parameterised protocol language and a simple session programming API. We face a number of challenges when building a complete type checker using the same methodology for Pabble, which is a dependent protocol language, and MPI, which is a standard parameterised implementation API. The Pabble language, with its well-formedness checks, reduces the undecidability issues in the role representation by using integers instead of general indices. The type checking process compares the protocol against a simplified, canonical local protocol extracted from the implementation, and this protocol extraction still poses a challenge. In particular, inferring source and destination processes from parametric source code is non-trivial. MPI uses process IDs (or ranks) to identify processes, and it is valid to perform numeric operations on the ranks to efficiently calculate target processes. This allows a program to exploit C language features while remaining valid. For example, instead of using a conventional conditional statement, an MPI function call of this form may be used:

where the process ID, […], is used as a boolean; a straightforward analysis of […] usages would therefore not be sufficient. In order to correctly calculate the target processes of the interactions, it will be necessary to simulate rank calculations by techniques such as symbolic execution or combinations of runtime techniques.
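
As a much simpler stand-in for symbolic execution, the sketch below simulates a rank calculation by evaluating it for every rank of a fixed-size run and recording the resulting sender and receiver pairs. The `target` function is a hypothetical example of rank arithmetic in which the rank doubles as a boolean; it is not the call elided above, and a concrete enumeration like this only covers one chosen number of processes.

```python
def extract_sends(target_of, size: int) -> list:
    """Evaluate target_of(rank, size) for each rank; None means no send."""
    pairs = []
    for rank in range(size):
        dest = target_of(rank, size)
        if dest is not None:
            pairs.append((rank, dest))
    return pairs


def target(rank: int, size: int):
    """Hypothetical rank arithmetic: rank 0 sends to 1, all others send to 0."""
    # The rank is used as a truth value, so no explicit comparison appears.
    return 0 if rank else 1


sends = extract_sends(target, 3)
```

The extracted pairs could then be compared with the pairs allowed by the local protocol, although a complete checker would need to reason about all process counts rather than a single one.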

5 Conclusion

This article introduced a new global protocol description language, Pabble, and applied it to ensure deadlock-free and type-safe communications in parallel programs. Local protocols are projected from a parameterised global protocol, and we outlined a methodology to specify and type-check MPI programs for safe parallel programming. Our global and local protocols bring the expressiveness of Scribble to new levels, overcoming the undecidability issue of the underlying parameterised multiparty session type theory [9] by a careful design choice for indices based on [16]. Combined with the multirole theory from [7], Pabble can represent and type-check representative MPI collective operators. We are not aware of any prior framework which is uniformly applicable to a safety guarantee for message-passing parallel programs which run over complex topologies, through static type checking that is low-cost compared to model checking.

Through the examples presented in this article, we have shown that the Pabble language is not limited to high-performance parallel applications. The examples, including web services and RPC, cover a broad category of interaction-centric scalable distributed applications. Our simple, formally based language provides an approach for designing services and applications with safe interaction patterns.

6 Related work

6.1 Formal verification for parallel applications

Formal verification for message-passing parallel programming has been actively studied in the area of MPI parallel applications. A recent survey [10] summarizes a wide range of model checking-based verification methods for MPI. Among them, ISP [29] is a dynamic verifier which applies model-checking techniques to identify potential communication deadlocks in MPI. Their tool uses a fixed test harness and in order to reduce the state space of possible thread interleavings of an execution, the tool exploits an independence between thread actions. Later in [30], the authors improved its scheduling policy to gain efficiency of the verification. While their approach aims to cover common deadlock patterns in MPI programs, it is still limited to a finite number of tests. Our approach does not rely on external testing, and all session typable programs are guaranteed communication-safe and deadlock-free by a low-cost static code generation and type checking.

TASS [26] is another tool that combines symbolic execution [25] and model-checking techniques to verify safety properties of MPI programs. The tool takes a C/MPI application and an input \(n \ge 1\) which restricts the input space, then constructs an abstract model with \(n\) processes and checks its functional equivalence and deadlocks by executing the model of the application. TASS does not verify properties for an unbounded number of communication participants nor treat parameterisation, whereas we can work with message-passing programs where the number of participants is unknown at compile time, if they are written in well-formed, projectable Pabble.

6.2 Formally based MPI languages

Pilot [5] is a parallel programming library built on standard MPI to provide a simplified parallel programming abstraction based upon CSP. The communication is synchronous and channels are untyped to facilitate reuse for different types. The implementation includes an analyser to detect communication deadlock at runtime. Our proposed typechecker is static and is able to detect and prevent deadlocks before execution.

Interprocedural control flow graph (ICFG) [27] and parallel control flow graph (pCFG) [4] are techniques to analyze MPI parallel programs for potential message leak errors. Their approach extends a traditional data-flow analysis by connecting control flow graphs of concurrent processes to their communication edges in order to derive the communication pattern and topology of a parallel program. They take a bottom-up engineering-based approach, in contrast to our formally based, top-down global protocol approach, which can give a high-level understanding of the overall communication by design, in addition to the communication safety assurance by multiparty session types.

6.3 Parameterised multiparty session types

Previous work from Ng et al. [20] introduces a C programming framework based on multiparty session types (MPSTs), but it does not treat parameterisation. Hence, the user needs to explicitly describe all interactions in the protocol, and the type checker does not work if the number of participants is unknown at compile time. Pabble's theoretical basis is developed in [9], where parameterised MPSTs are formalized using the dependent type theory of Gödel’s System \({\mathcal {T}}\). The main aim in [9] is to investigate the decidability and expressiveness of parameterisations of participants. Type checking in [9] is undecidable when the indices are not limited to decidable arithmetic subsets or the number of loops in the parameterised types is infinite. The design of Pabble is inspired by the LTSA tool from a concurrency modeling textbook used for undergraduate teaching in the authors’ university for over two decades [16]. The notations for parameterisation from the LTSA tool offer not only practical restrictions to cope with the undecidability of parameterised MPSTs [9], but also concise representations for parameterised parallel languages. Our work is the first to apply parameterised MPSTs in a practical environment, and one foremost aim of our framework with Pabble and its parameterised notation is to be developer-friendly [24] without compromising the strong formal basis of session types.

6.4 Dependent typing systems

Liquid Type [23] is a dependent typing system to automatically infer memory safety properties from program source code without using verbose annotations. The work [22] introduced an analyser for the C language in the low-level imperative environment based on Liquid Types and refinement types. The recent work on Liquid Types [15] applied the tool with SMT solvers to assist parallelisation of code regions by determining statically whether parallel threads will run on disjoint shared memory without races. Our work applies dependent session types to guarantee different kinds of safety, communication safety and deadlock freedom, in explicit message-passing based distributed and parallel programming rather than shared memory concurrency. It is an interesting future topic to integrate with model-checking tools to handle projectability with more complex indices in addition to functional correctness of session programs.

6.5 Session-based approaches to parallel programming

A recent work [11, 17] aims to use session types for deductive verification of MPI programs. A new type language is designed specifically for MPI and they used VCC, a concurrent C verifier tool to verify correctness of MPI against the type language. While the Pabble language was designed with influences from parallel programming APIs and parallel programming use cases, the language was designed to be an independent high-level abstraction over distributed interactions. As a result, our language makes no assumption about the execution environment (e.g., collective loops in MPI), and allows Pabble to represent general protocols from distributed systems or web services with distinct roles as shown in the examples.

7 Future work

Future work includes extending Pabble and the underlying theory with support for modeling process creation and destruction, such as the dynamic multirole approach described in [7].

A number of enhancements are planned for Pabble, including support for annotations which can complement the protocol description to specify assertions. The type checking process can use the extra constraints or conditions provided, in combination with model checkers, to also assure functional correctness of the overall application. Annotations will also enable integration with the runtime monitoring described in [14] for a combined static and dynamic approach to communication-correct applications using Pabble.

An approach to generating distributed parallel applications is in the works, using a combination of a Pabble protocol, which describes the interaction aspects of the application, and computation code, which describes the sequential computation behavior of the application.

Notes

Acknowledgments

The work is funded by EPSRC EP/K034413/1, EP/K011715/1 and EP/L00058X/1, EU project FP7-612985 (UpScale) 257906, 287804 and 318521.

References

  1. Asanovic K, Bodik R, Demmel J, Keaveny T, Keutzer K, Kubiatowicz J, Morgan N, Patterson D, Sen K, Wawrzynek J, Wessel D, Yelick K (2009) A view of the parallel computing landscape. Commun ACM 52(10):56–67
  2. Aspinall D, Hofmann M (2005) Advanced topics in types and programming languages, chap. Dependent types. MIT Press, Cambridge
  3. Bettini L, Coppo M, D’Antoni L, De Luca M, Dezani-Ciancaglini M, Yoshida N (2008) Global progress in dynamically interleaved multiparty sessions. In: CONCUR 2008, LNCS, vol 5201, Springer, Berlin, pp 418–433
  4. Bronevetsky G (2009) Communication-sensitive static dataflow for parallel message passing applications. In: CGO’09, IEEE, pp 1–12
  5. Carter J, Gardner WB, Grewal G (2010) The Pilot approach to cluster programming in C. In: IPDPSW, IEEE, pp 1–8
  6. Castagna G, Dezani-Ciancaglini M, Padovani L (2012) On global types and multi-party session. LMCS 8(1)
  7. Deniélou PM, Yoshida N (2011) Dynamic multirole session types. In: POPL, ACM, pp 435–446
  8. Deniélou PM, Yoshida N (2012) Multiparty session types meet communicating automata. In: ESOP, LNCS, vol 7211, Springer, Berlin, pp 194–213
  9. Deniélou PM, Yoshida N, Bejleri A, Hu R (2012) Parameterised multiparty session types. LMCS 8(4)
  10. Gopalakrishnan G et al (2011) Formal analysis of MPI-based parallel programs. Commun ACM 54(12):82–91
  11. Honda K, Marques E, Martins F, Ng N, Vasconcelos V, Yoshida N (2012) Verification of MPI programs using session types. In: EuroMPI’12, LNCS, vol 7490
  12. Honda K, Mukhamedov A, Brown G, Chen TC, Yoshida N (2011) Scribbling interactions with a formal foundation. In: ICDCIT, LNCS, vol 6536, Springer, Berlin, pp 55–75
  13. Honda K, Yoshida N, Carbone M (2008) Multiparty asynchronous session types. In: POPL’08, pp 273–284
  14. Hu R, Neykova R, Yoshida N, Demangeon R (2013) Practical interruptible conversations: Distributed dynamic verification with session types and python. In: RV 2013, LNCS, vol 8174, pp 148–130
  15. Kawaguchi M, Rondon P, Bakst A, Jhala R (2012) Deterministic parallelism via liquid effects. In: PLDI’12, pp 45–54
  16. Magee J, Kramer J (2006) Concurrency: state models and Java programs, 2nd edn. Wiley, New York
  17. Marques ERB, Martins F, Vasconcelos VT, Ng N, Martins ND (2013) Towards deductive verification of mpi programs against session types. In: PLACES’13, EPTCS, vol 137, pp 103–113
  18. Mostrous D, Yoshida N, Honda K (2009) Global principal typing in partially commutative asynchronous sessions. In: ESOP, LNCS, vol 5502, pp 316–332
  19. Ng N, Yoshida N (2014) Pabble: parameterised scribble for parallel programming. In: PDP 2014, IEEE (to appear)
  20. Ng N, Yoshida N, Honda K (2012) Multiparty session C: safe parallel programming with message optimisation. In: TOOLS, LNCS, vol 7304, Springer, Berlin, pp 202–218
  21.
  22. Rondon PM, Kawaguchi M, Jhala R (2010) Low-level liquid types. In: POPL’10, pp 131–144. http://dl.acm.org/citation.cfm?id=1375602
  23. Rondon PM, Kawaguchi M, Jhala R (2008) Liquid types. In: PLDI’08, pp 159–169
  24. Scribble homepage. http://scribble.github.io
  25. Siegel S, Mironova A, Avrunin G, Clarke L (2008) Combining symbolic execution with model checking to verify parallel numerical programs. ACM TOSEM 17(2):1–34
  26. Siegel SF, Zirkel TK (2011) Automatic formal verification of MPI-based parallel programs. In: PPoPP’11, ACM Press, p 309
  27. Strout M, Kreaseck B, Hovland P (2006) Data-flow analysis for MPI programs. In: ICPP’06, IEEE, pp 175–184
  28.
  29. Vo A, Vakkalanka S, DeLisi M, Gopalakrishnan G, Kirby RM, Thakur R (2009) Formal verification of practical MPI programs. In: PPoPP’09, pp 261–270
  30. Vo A et al (2010) A scalable and distributed dynamic formal verifier for MPI programs. In: SC’10, IEEE, pp 1–10
  31. W3C Web Services Choreography Working Group. http://www.w3.org/2002/ws/chor/
  32. Web Services Choreography Requirements. http://www.w3.org/TR/ws-chor-reqs/
  33. Xi H, Pfenning F (1998) Eliminating array bound checking through dependent types. In: PLDI ’98, pp 249–257

Copyright information

© The Author(s) 2014

Open Access This article is distributed under the terms of the Creative Commons Attribution License which permits any use, distribution, and reproduction in any medium, provided the original author(s) and the source are credited.

Authors and Affiliations

  1. Department of Computing, Imperial College London, London, UK
