1 Introduction

Communication is increasingly important in modern applications, especially distributed ones. Remote procedure/method invocations are losing their prominence at the application level because they scale poorly, and scalability is crucial in modern distributed applications. Communication-based modelling is also appealing for non-distributed software. For instance, application-level protocols can be devised to specify the behavioural constraints ensuring the correct use of a library or of an off-the-shelf component. This trend is also witnessed by the growth of service-oriented and cloud computing in the software industry. In this context, composition of software becomes paramount and requires proper theoretical foundations as well as tool support. In fact, although scalable, communication-centric applications may pose non-trivial obstacles to validation.

We showcase CAT, a prototype toolkit supporting the validation of communication-centric applications. This toolkit (available at [3]) is based on contract automata [4–6], a recently proposed formal model of service composition. Contract automata abstractly describe (the communication pattern of) services as automata whose transitions represent requests and offers. An interaction between two services occurs when a match action is possible, that is, when one service’s offer matches a partner’s request. Intuitively, contract automata capture the behaviour of services by tracking the interactions they are willing to execute with each other. Service composition is naturally described in terms of product automata. The matching between offers and requests has to guarantee agreement properties that amount to safe communications. Intuitively, an automaton admits strong agreement if it has at least one trace made only of match transitions, and it is strongly safe if all its traces are in strong agreement. Basically, strong agreement guarantees that the composition of services has a sound execution, while strong safety guarantees that all executions of the composition are sound. Agreement and safety are defined likewise, except that traces may also contain (unmatched) offers, which model interactions with an external environment.

By means of an example we describe how the analysis of communication-centric applications can be supported by CAT. To this purpose we borrow the two-buyers protocol (2BP) from [10], which we now briefly recall. Two buyers, say \(\mathtt {B}_1\) and \(\mathtt {B}_2\), collaborate in purchasing an item from a seller \(\mathtt {S}\). Buyer \(\mathtt {B}_1\) starts the protocol by asking \(\mathtt {S}\) for the price of the desired item (\( price \)); the seller \(\mathtt {S}\) makes an offer by sending the message \( quote _1\) to \(\mathtt {B}_1\) and the message \( quote _2\) to \(\mathtt {B}_2\). Once it has received its quote, buyer \(\mathtt {B}_1\) sends to \(\mathtt {B}_2\) its contribution for purchasing the item (message \( contrib \)). Buyer \(\mathtt {B}_2\) waits for the quote from \(\mathtt {S}\) and the contribution from \(\mathtt {B}_1\). Then, it decides whether to terminate by issuing the \( nop \) message to \(\mathtt {S}\), or to proceed by sending an acknowledgement to \(\mathtt {S}\). In the latter case, \(\mathtt {S}\) sends the item to \(\mathtt {B}_2\) (\( delivery \)), while if it receives \( nop \) it terminates with no further action. Figure 1 shows the contract automata of \(\mathtt {B}_1\), \(\mathtt {B}_2\), and \(\mathtt {S}\), where each interaction is split into offers (overlined labels) and requests (non-overlined labels).

Fig. 1. The contract automata for 2BP

We will apply CAT to the above protocol and show how, when the agreement property of interest is violated, we identify and fix defects.

2 Background: Contract Automata

Contract automata have been introduced in [4], from which we borrow the following definitions. Intuitively, a contract automaton represents the behaviour of a set of principals capable of performing some actions; more precisely, as formalised in Definition 1, the actions of contract automata allow them to “advertise” offers, “make” requests, or “handshake” on matching offer/request actions. The number of principals in a contract automaton is its rank, and a vectorial representation is used for tracking the moves of each principal in it. The transitions are labelled with tuples in the set \(\mathbb {L} = \mathbb {R}\cup \mathbb {O}\cup \{\Box \}\) where: requests of principals will be built out of \(\mathbb {R}\) while their offers will be built out of \(\mathbb {O}\), \(\mathbb {R}\cap \mathbb {O}= {\emptyset }\), and \(\Box \notin \mathbb {R}\cup \mathbb {O}\) is a distinguished label to represent components that stay idle. We let \(a,b,c,\ldots \) range over \(\mathbb {L}\) and fix an involution \(\overline{\cdot }: \mathbb {L}\rightarrow \mathbb {L}\) such that \(\overline{\mathbb {R}} \subseteq \mathbb {O}\), \(\overline{\mathbb {O}} \subseteq \mathbb {R}\), and \(\overline{\Box } = \Box \).

Offer actions are overlined, e.g. \(\overline{a}\). Let \(\vec {v}=(a_1, \ldots , a_{n})\) be a vector of rank \(n \ge 1\), in symbols \(r_v\); then \({\vec {v}}_{({i})}\) denotes its i-th element. We write \(\vec {v}_1\vec {v}_2\ldots \vec {v}_m\) for the concatenation of m vectors \(\vec {v}_i\); \(|\vec v| = n\) is the rank of \(\vec v\); and \({\vec v}^k\) is the vector obtained by concatenating k copies of \(\vec v\).

Definition 1

A tuple \(\vec a\) on \(\mathbb {L}\) is a request (resp. offer) action on b iff \(\vec a\) is of the form \(\Box ^* b\, \Box ^*\) with \(b \in \mathbb {R}\) (resp. \(b \in \mathbb {O}\)), where \(\Box ^*\) stands for a possibly empty sequence of idle labels \(\Box \). A match (action) on b is a tuple on \(\mathbb {L}\) of the form \(\Box ^* b\, \Box ^*\, \overline{b}\, \Box ^*\) ( \(b \in \mathbb {R}\cup \mathbb {O}\)). Let \(\bowtie \subseteq \mathbb {L}^*\times \mathbb {L}^*\) be the symmetric closure of \(\mathop {\bowtie }\limits ^{\cdot }\subseteq \mathbb {L}^*\times \mathbb {L}^*\) where \(\vec a_1 \mathop {\bowtie }\limits ^{\cdot }\vec a_2\) iff \(|\vec a_1| = |\vec a_2|\) and all the following conditions hold

  • \(\exists b \in \mathbb {R}\cup \mathbb {O}\;:\;\vec a_1\) is either a request or an offer on b;

  • \(\exists b \in \mathbb {R}\cup \mathbb {O}\;:\;\vec a_1 \text { is an offer on } b \implies \vec a_2 \text { is a request on } b\)

  • \(\exists b \in \mathbb {R}\cup \mathbb {O}\;:\;\vec a_1 \text { is a request on } b \implies \vec a_2 \text { is an offer on } b\).

Definition 2

Assume as given a finite set of states \(\mathfrak {Q}=\{q_1,q_2, \ldots \}\). Then a contract automaton \(\mathcal {A}\) of rank n is a tuple \(\langle Q, \vec {q_0}, A^{r}, A^{o}, T, F \rangle \), where

  • \(Q=Q_1 \times \ldots \times Q_n \subseteq \mathfrak {Q}^n\)

  • \(\vec {q_0} \in Q\) is the initial state

  • \(A^{r}\subseteq \mathbb {R}, A^{o} \subseteq \mathbb {O}\) are finite sets (of requests and offers, respectively)

  • \(F \subseteq Q\) is the set of final states

  • \(T \subseteq Q \times A \times Q\) is the set of transitions, where \(A \subseteq (A^{r} \cup A^{o} \cup \{\Box \})^n\), with \(\Box \) the idle label, and if \( (\vec {q},\vec {a},\vec {q'}) \in T\) then both the following conditions hold:

    • \(\vec {a}\) is either a request or an offer or a match

    • if \({\vec {a}}_{({i})} = \Box \) then it must be \({\vec {q}}_{({i})}={\vec {q'}}_{({i})}.\)

A principal is a contract automaton of rank 1 such that \(A^r \cap co(A^o)= {\emptyset }\), where \(co(A^o) = \{\overline{a} \mid a \in A^o\}\).

A principal is not allowed to make a request on actions that it offers. We have two different operators for composing contract automata, that interleave or match the transitions of their operands. We only force a synchronisation to happen when two contract automata are ready on their respective request/offer action. These operators represent two different policies of orchestration. The first operator, called product, considers the case when a service S joins a group of services already clustered as a single orchestrated service \(S'\). In the product of S and \(S'\), the first can only accept the still available offers (requests, respectively) of \(S'\) and vice versa. In other words, S cannot interact with the principals of the orchestration \(S'\), but only with it as a whole component.

The second operation of composition, called a-product, instead puts all the principals of S at the same level as those of \(S'\). Any matching request-offer pair of either contract is split: the offers and requests become available again, and are re-combined with complementary actions of S, and vice versa. The a-product satisfactorily models coordination policies in dynamically changing environments, because it is a form of dynamic orchestration that adjusts the workflow of messages when new principals join the contract.

We now introduce our first operation of composition; recall that we implicitly assume the alphabet of a contract automaton of rank m to be \(A \subseteq (A^{r} \cup A^{o} \cup \{\Box \})^m\), where \(\Box \) is the idle label.

Definition 3

(Product). Let \(\mathcal {A}_i=\langle Q_i,\vec {q_0}_i,A^{r}_i, A^{o}_i, T_i, F_i \rangle , i \in 1 \ldots n\) be contract automata of rank \(r_i\). The product \(\bigotimes _{i \in 1 \ldots n} \mathcal {A}_i\) is the contract automaton \(\langle Q, \vec {q_0}, A^{r}, A^{o}, T, F \rangle \) of rank \(m= \sum _{i \in 1 \ldots n} r_i\), where:

  • \(Q = Q_1 \times \ldots \times Q_n\), where \(\vec {q_0}= \vec {q_0}_1 \ldots \vec {q_0}_n\)

  • \(A^{r} =\bigcup _{i \in 1 \cdots n} A^{r}_i, \quad A^{o}=\bigcup _{i \in 1 \cdots n} A^{o}_i\)

  • \(F=\{\vec {q}_1 \ldots \vec {q}_n\mid \vec {q}_1 \ldots \vec {q}_n \in Q, \vec {q}_i \in F_i, i \in 1 \ldots n \}\)

  • T is the least subset of \(Q \times A \times Q\) s.t. \((\vec {q},\vec c, \vec {q}') \in T\) iff, when \(\vec q = \vec q_1 \ldots \vec q_n \in Q\),

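    • either there are \(1 \le i < j \le n\) s.t. \((\vec {q}_i,\vec a_i,\vec {q}'_i) \in T_i\), \((\vec q_j, \vec a_j,\vec {q}'_j) \in T_j\), \(\vec a_i \bowtie \vec a_j\) and

      \(\vec c = \Box ^{u}\, \vec a_i\, \Box ^{v}\, \vec a_j\, \Box ^{z}\) with \(u = r_1 + \ldots + r_{i-1}\), \(v = r_{i+1} + \ldots + r_{j-1}\), \(|\vec c| = m\), and

      \(\vec {q}' = \vec q_1 \ldots \vec q_{i-1}\,\; \vec {q}'_i\,\ \vec q_{i+1} \ldots \vec q_{j-1}\,\; \vec {q}'_j\,\ \vec q_{j+1} \ldots \vec q_n\)

    • or there is \(1 \le i \le n\) s.t. \((\vec {q}_i,\vec a_i,\vec {q}'_i) \in T_i\) and

      \(\vec c = \Box ^{u}\, \vec a_i\, \Box ^{v}\) with \(u = r_1 + \ldots + r_{i-1}\), \(v = r_{i+1} + \ldots + r_n\), and

      \(\vec {q}' = \vec q_1 \ldots \vec q_{i-1}\,\; \vec {q}'_i\,\ \vec q_{i+1} \ldots \vec q_n\) and

      \(\forall j \ne i, 1 \le j \le n, (\vec q_j, \vec a_j,\vec {q}'_j) \in T_j\) it does not hold that \(\vec a_i \bowtie \vec a_j\),

      where \(\Box ^{k}\) denotes the vector of k idle labels \(\Box \).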
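To make Definition 3 concrete, the following Python sketch computes the transitions of a product over the whole state space Q. It is an illustration under stated assumptions, not CAT's Java code: labels use the integer encoding described in Sect. 3 (positive for offers, negative for requests, 0 for idle), an offer k is taken to be complementary to the request -k, and the names `product`, `complementary`, `is_action`, and `embed` are hypothetical.

```python
from itertools import product as cartesian


def is_action(label):
    """True if exactly one component is non-idle (a request or an offer)."""
    return sum(1 for a in label if a != 0) == 1


def complementary(l1, l2):
    """True if one label is an offer and the other a request on the same action."""
    return is_action(l1) and is_action(l2) and sum(l1) == -sum(l2)


def product(state_sets, transition_sets):
    """Transitions of the product of n automata, following Definition 3.

    state_sets[i] lists the states of the i-th operand (tuples of ints);
    transition_sets[i] lists its transitions as (source, label, target).
    """
    n = len(state_sets)
    ranks = [len(states[0]) for states in state_sets]
    result = []
    for q in cartesian(*state_sets):  # q = (q_1, ..., q_n)
        source = tuple(x for part in q for x in part)
        out = [[t for t in transition_sets[i] if t[0] == q[i]] for i in range(n)]

        def embed(moves):
            """Pad with idle labels; operands that do not move keep their state."""
            label, target = [], []
            for k in range(n):
                if k in moves:
                    label.extend(moves[k][1])
                    target.extend(moves[k][2])
                else:
                    label.extend([0] * ranks[k])
                    target.extend(q[k])
            return tuple(label), tuple(target)

        for i in range(n):
            for ti in out[i]:
                matched = False
                for j in range(n):
                    if j == i:
                        continue
                    for tj in out[j]:
                        if complementary(ti[1], tj[1]):
                            matched = True
                            if i < j:  # emit each matching pair only once
                                result.append((source, *embed({i: ti, j: tj})))
                if not matched:  # interleave only if no partner is ready
                    result.append((source, *embed({i: ti})))
    return result


# A buyer offering action 1 and a seller requesting it (both of rank 1).
buyer = ([(0,), (1,)], [((0,), (1,), (1,))])
seller = ([(0,), (1,)], [((0,), (-1,), (1,))])
T = product([buyer[0], seller[0]], [buyer[1], seller[1]])
```

From the joint initial state the only move is the match; the unmatched offer and request survive only in states where the partner is not ready, which mirrors the second case of the definition.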

To retrieve the principals involved in a contract automaton obtained through the product introduced above, we introduce the following:

Definition 4

(Projection). Let \( \mathcal {A} =\langle Q, \vec {q_0}, A^{r}, A^{o}, T, F \rangle \) be a contract automaton of rank n, then the projection on the i-th principal is \(\prod ^i( \mathcal {A})=\langle \prod ^i(Q), {\vec {q_0}}_{({i})}, \prod ^i(A^{r}), \prod ^i(A^{o}), \prod ^i(T), \prod ^i(F) \rangle \) where \(i \in 1 \ldots n\) and:

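$$ \prod ^i(Q) = \{{\vec {q}}_{({i})} \mid \vec {q} \in Q\} \quad \prod ^i(F)=\{{\vec {q}}_{({i})} \mid \vec {q} \in F\} \quad \prod ^i(A^{r})=\{ a \mid a \in A^{r}, (q,a,q') \in \prod ^i(T)\} $$

$$ \prod ^i(A^{o})=\{ \overline{a} \mid \overline{a} \in A^{o}, (q,\overline{a},q') \in \prod ^i(T)\} \quad \prod ^i(T)=\{ ({\vec {q}}_{({i})}, {\vec {a}}_{({i})}, {\vec {q'}}_{({i})}) \mid (\vec {q},\vec {a},\vec {q'}) \in T \wedge {\vec {a}}_{({i})} \ne \Box \} $$

Here \(\Box \) is the idle label.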
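The projection can be illustrated with a few lines of Python. This is a sketch rather than CAT's implementation: it assumes the integer encoding of Sect. 3 (0 for idle) and reads the projection of transitions as keeping the i-th components of every transition in which principal i is not idle. The function names are hypothetical and indices are 0-based.

```python
def project_states(states, i):
    """Local states of the i-th principal occurring in the given global states."""
    return {(q[i],) for q in states}


def project_transitions(transitions, i):
    """Moves of the i-th principal: drop transitions in which it stays idle."""
    return {
        ((q[i],), (a[i],), (q2[i],))
        for (q, a, q2) in transitions
        if a[i] != 0
    }


# A rank-2 automaton with one match and the two corresponding unmatched moves.
T = [
    ((0, 0), (1, -1), (1, 1)),
    ((0, 1), (1, 0), (1, 1)),
    ((1, 0), (0, -1), (1, 1)),
]
first = project_transitions(T, 0)
second = project_transitions(T, 1)
```

Because the result is a set, the match and the unmatched offer of the first principal collapse into the same local transition.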

We now define the associative product.

Definition 5

(a-Product). Let \( \mathcal {A}_1,\mathcal {A}_2\) be two contract automata of rank n and m, respectively, and let \(I=\{\prod ^i(\mathcal {A}_1) \mid 0< i \le n\} \cup \{ \prod ^j(\mathcal {A}_2) \mid 0 < j \le m \}\). Then the a-product of \(\mathcal {A}_1\) and \(\mathcal {A}_2\) is \( \mathcal {A}_1 \boxtimes \mathcal {A}_{2} = \bigotimes _{\mathcal {A}_i \in I} \mathcal {A}_i\).

3 CAT at Work

We have implemented CAT in Java according to the simple architecture of Fig. 2. The main class of CAT extends JAMATA [3], a framework for manipulating automata that provides methods for loading, storing, printing, and representing finite state automata. In other words, CAT specializes JAMATA to contract automata, offering developers an API for creating and verifying contract automata. CAT also interfaces with AMPL [8], a separate module for solving linear optimization problems, described in Sect. 5. This is an original facet of CAT: it maps the check of the agreement properties of interest onto a linear optimization problem.

Fig. 2. The architecture of CAT

The user of CAT has access to its API, which can be conceptually classified as follows:

  • Automata operations consist of the methods CA proj(int i), which returns the automaton specifying the \(i^{th}\) service of the composition, and CA product(CA[] aut) and CA aproduct(CA[] aut), which compute the product and the associative product of contract automata, respectively. Interestingly, product has to filter out the offer and request transitions when the source state has a corresponding outgoing match transition. Method aproduct is built on top of product by invoking product on the services obtained as projections of the input automata.

  • Safety check consists of the instance methods safe, agreement, strongAgreement, and strongSafe, which return true if the corresponding agreement property holds on the contract automaton. Section 5 discusses the property of weak agreement.

  • Controllers consist of the methods CA mpc() and CA smpc() that return the most permissive controller (MPC), for respectively agreement and strong agreement. A controller basically represents the largest (strongly) safe sub-automaton and is obtained through a standard construction of Control Theory [7].

  • Liable detection consists of the methods CATransition[] liable(), which returns the transitions from a state s to a state t such that s is in the MPC but t is not, and CATransition[] strongLiable(), which similarly returns such transitions for the MPC of the strong agreement property. In particular, liable services are those responsible for leading a contract composition into a failure.

  • Decentralization includes int[][] branchingCondition(), that returns two states and an action for which the branching condition is violated. Basically, the branching condition holds if the actions of a service are not affected by the states of the other services in the composition. Another similar method that deals with open-ended interactions is int[][] extendedBranchingCondition(). The last method in this category is int[] mixedChoice() that returns a mixed-choice state (a state where a principal has enabled both offers and requests inside matches). All such methods return null when the conditions they check do not hold.
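The controller and liable-detection operations can be illustrated with a small Python sketch. It is not CAT's Java code and makes two explicit assumptions: every transition is controllable, so the most permissive controller reduces to trimming the automaton after removing the forbidden transitions (unmatched requests for agreement, everything but matches for strong agreement); and labels use the integer encoding described later in this section. The names `kind`, `controller_states`, and `liable` are hypothetical.

```python
def kind(label):
    """Classify a label: positive entries are offers, negative ones requests."""
    active = [a for a in label if a != 0]
    if len(active) == 1:
        return "offer" if active[0] > 0 else "request"
    if len(active) == 2 and active[0] == -active[1]:
        return "match"
    return "other"


def controller_states(initial, finals, transitions, allowed_kinds):
    """States kept by the controller: reachable and able to reach a final state
    using only transitions whose kind is in allowed_kinds."""
    kept = [t for t in transitions if kind(t[1]) in allowed_kinds]
    # Backward closure: states that can reach a final state.
    good = set(finals)
    changed = True
    while changed:
        changed = False
        for (s, _, t) in kept:
            if t in good and s not in good:
                good.add(s)
                changed = True
    if initial not in good:
        return set()  # no controller exists
    # Forward closure from the initial state, staying inside the good states.
    reached, frontier = {initial}, [initial]
    while frontier:
        s = frontier.pop()
        for (src, _, t) in kept:
            if src == s and t in good and t not in reached:
                reached.add(t)
                frontier.append(t)
    return reached


def liable(transitions, mpc):
    """Transitions leaving the controller: source inside, target outside."""
    return [t for t in transitions if t[0] in mpc and t[2] not in mpc]


# Toy rank-2 automaton: one match, plus an unmatched request leading to a dead end.
transitions = [((0, 0), (1, -1), (1, 1)), ((0, 0), (0, -2), (0, 2))]
mpc = controller_states((0, 0), [(1, 1)], transitions, {"offer", "match"})
```

In the toy automaton the unmatched request leaves the controller, so it is the only liable transition.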

We describe how to interact with the API of CAT through a simple command line interface (we plan to develop a GUI as well). The API is displayed and the user can choose one of the options (this is not shown here). Each displayed option corresponds to one of the methods described above. For instance, after choosing to compute a product, the user is asked to set the contract automata on which to take the product.

[Output 1: CAT session for loading the automata and computing their product]

The user inputs the automata in CAT by providing their file names (line 2 of Output 1) and answering yes on line 8 until there are no more automata to load (in which case the user enters no to obtain the result of the product). For each entered automaton, CAT prints a textual description on the screen (lines 4–6 in Output 1) reporting the rank, initial and final states, and the list of transitions. The transitions are triples \(\mathtt {(s,l,t)}\) where \(\mathtt {s}\) is the source state, \(\mathtt {l}\) is the label, and \(\mathtt {t}\) is the target state. These elements are lists of length r (the rank of the automaton); for instance, in Output 1 \(r=1\) (cf. line 5). The i-th element of each list corresponds to the i-th service. In particular, the i-th action in the list of labels identifies the action performed by the i-th service; such an action is strictly positive if it is an offer, strictly negative if it is a request, and 0 if the service is idle in the transition. For B1, actions \(\overline{price}\), \( quote _1\), and \(\overline{contrib}\) are represented with the integers 1, \(-2\), and 3, respectively.
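The integer encoding of labels lends itself to a compact classification of transitions. The following Python sketch is an illustration only (CAT is written in Java, and the function name `classify` is hypothetical); it assumes, as the labels shown in this section suggest, that an offer k is matched by the request -k.

```python
from typing import List, Optional


def classify(label: List[int]) -> Optional[str]:
    """Classify a label under the encoding above.

    Positive entries are offers, negative entries are requests, 0 is idle.
    Returns "offer", "request", "match", or None if the label is none of these.
    """
    active = [a for a in label if a != 0]
    if len(active) == 1:
        return "offer" if active[0] > 0 else "request"
    if len(active) == 2 and active[0] == -active[1]:
        return "match"
    return None


# The three actions of B1 as encoded in the text: price (offer),
# quote_1 (request), contrib (offer).
b1_labels = [[1], [-2], [3]]
b1_kinds = [classify(l) for l in b1_labels]
```

A label with a single non-zero entry is an unmatched offer or request, while a label such as [1, 0, -1] pairs an offer with the complementary request and is therefore a match.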

First CAT computes the product automaton, and then it displays the result (B1xB2xS in our example) and stores it in a file named B1xB2xS.data. From the main menu, the user can now choose to compute the MPC of the product automaton (shown in Fig. 1); the result is displayed in Output 2 below. Once the product automaton is loaded, CAT will compute the MPC:

[Output 2: CAT session computing the MPC of B1xB2xS]

The resulting automaton is of rank 3 and corresponds to the MPC of Fig. 1. The final states are represented as a list where the i-th element is the list of the final states of the i-th service. This representation makes it possible to check whether a state of the MPC is final without explicitly enumerating all the final states of the MPC.
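The benefit of this representation can be seen in a short Python sketch (an illustration, not CAT's Java code; the function name and the sample values are hypothetical): a global state is final exactly when each of its components is final for the corresponding service.

```python
def is_final(state, finalstates):
    """True if every local state is among the final states of its service."""
    return all(local in finals for local, finals in zip(state, finalstates))


# Hypothetical rank-3 example: one list of final local states per service.
finalstates = [[3], [2, 3], [4]]
```

With this encoding the two final local states of the second service stand for two global final states, without listing either of them explicitly.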

The transitions on lines 5–8 in Output 2 represent the transitions of the MPC; note that in each transition there is always an idle service. For instance, consider the transition ([0, 0, 0], [1, 0, -1], [1, 0, 1]): it corresponds to the transition of the MPC in Fig. 1 in which the offer of B1 (action 1) is matched by the request of S (the second component of the label is 0 because B2 is idle). The MPC can now be saved in a file as per line 9 in Output 2.

The underlying coordination mechanism of contract automata is orchestration. More precisely, services are oblivious of their partners and exchange messages through a “hidden” orchestrator (formalised by the MPC, if any). Whenever possible, one would like to have services interacting without the “supervision” of an orchestrator, using FIFO buffers. Mild conditions [6] ensure that choreographies are sound, in other words that all the interactions among services are successful. We briefly discuss this issue below. For synchronous interactions (where buffers have size 1 and a single buffer may be non-empty), services have to enjoy the branching condition, which is necessary and sufficient for services to form a sound choreography; that is, the branching condition guarantees the soundness of “unsupervised” communications when they are synchronous [5, 6]. However, the branching condition does not suffice for asynchronous interactions (namely when buffers are unbounded and more than one buffer is possibly non-empty). In this case, an additional sufficient and commonly required condition is the absence of mixed-choice states, i.e., states where more than one service can perform an offer (see [6]). Consider now Fig. 1 and Output 3, where the state \(\vec {q_2}\) corresponds to [2,0,2], the state \(\vec {q_3}\) to [2,1,3], and the flagged transition to the label [3,-3,0]. The MPC does not enjoy the branching condition, as CAT reports:

[Output 3: CAT report on the violation of the branching condition in the MPC]

It is important to observe that the message in Output 3 also flags states and transitions for which the condition is violated. We discuss the problem by considering the automata in Fig. 1. The local state of buyer \(\mathtt {B}_1\) in \(\vec {q_2}\) and \(\vec {q_3}\) is \(q_{B12}\), while the local state of \(\mathtt {B}_2\) in \(\vec q_2\) is \(q_{B20}\), and in \(\vec q_3\) is \(q_{B21}\). Therefore, if \(\mathtt {B}_2\) is in local state \(q_{B20}\), where it is waiting for \(\overline{ quote }_2\), then without an orchestrator the offer \(\overline{contrib}\) from \(\mathtt {B}_1\) could fill up the 1-buffer of \(\mathtt {B}_2\), leading to a deadlock. A simple fix consists in swapping the order in which the quotes are sent by the seller; CAT reports that the amended protocol (not shown here) enjoys the branching condition. The contract automaton has no mixed-choice states, as detected by CAT. A mixed-choice state could be introduced in 2BP if, e.g., \(\mathtt {B}_2\) could send the acknowledgement to \(\mathtt {S}\) or receive \(\overline{contrib}\) from \(\mathtt {B}_1\) in any order. For this variant of 2BP, CAT finds the mixed-choice state, thus showing that these services do not form a sound choreography.
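The kind of check behind Output 3 can be sketched in Python. The sketch is not CAT's implementation and rests on one reading of the branching condition, assumed here: whenever a principal performs an offer (possibly inside a match) in a state q1, the same labelled transition must be enabled in every other state q2 in which that principal has the same local state. The function name is hypothetical, and the toy fragment reuses the two states and the label discussed above together with a hypothetical label (0, -4, 4) for the second quote.

```python
def branching_violation(states, transitions):
    """Return (q1, q2, label) witnessing a violation, or None if there is none.

    states lists the (reachable) global states; transitions are triples
    (source, label, target) under the integer encoding of labels.
    """
    enabled = {}
    for (s, a, _) in transitions:
        enabled.setdefault(s, set()).add(a)
    for (q1, a, _) in transitions:
        senders = [i for i, x in enumerate(a) if x > 0]
        if not senders:
            continue  # an unmatched request has no sender
        i = senders[0]
        for q2 in states:
            # Same local state for the sender, but the action is not enabled.
            if q2[i] == q1[i] and a not in enabled.get(q2, set()):
                return (q1, q2, a)
    return None


# Toy fragment: contrib (3) is enabled in [2,1,3] but not in [2,0,2],
# although B1 (index 0) is in local state 2 in both.
states = [(2, 0, 2), (2, 1, 3), (3, 2, 3)]
transitions = [
    ((2, 0, 2), (0, -4, 4), (2, 1, 3)),
    ((2, 1, 3), (3, -3, 0), (3, 2, 3)),
]
witness = branching_violation(states, transitions)
```

As in the discussion above, the witness consists of two global states that the sender cannot tell apart and of the action that is enabled in only one of them.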

4 Detailing the Implementation of CAT

CAT consists of a class CAUtil and of other classes CA and CATransition, extending two corresponding super-classes of JAMATA. The class CA provides the main functionalities of CAT; its instance variables capture the basic structure of our automata:

  • int rank is the rank of the automaton;

  • int[] initial is the initial state of the automaton (the array is of size rank);

  • int[] states is the vector of the number of local states of each principal in the contract automaton (the array is of size rank);

  • int[][] finalstates contains the final states of each principal in the contract automaton;

  • CATransition[] tra contains the transitions of the contract automaton.

The n local states of a principal are represented as integers in the range \(0,\ldots ,n-1\); in this case, states.length = 1 and states[0] = n. For an automaton of rank \(m > 1\), states is an m-vector such that states[i] yields the number of states of the \(i^{th}\) principal. This low-level representation (together with the encoding of actions and labels as integers) enabled us to optimize space.

The class CATransition describes a transition of a contract automaton. The instance variables of a CATransition object are:

  • int[] source (the starting state of the transition);

  • int[] label (the label of the transition);

  • int[] target (the arriving state of the transition).

The class CATransition provides methods to extract its instance variables, to check if the transition is an offer, a request or a match, and to extract the (index of the) principal performing the offer, if any.
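The data layout described in this section can be mirrored in a few lines of Python. The sketch below is only an illustration: CAT's classes are written in Java and extend JAMATA, whereas here the field names are taken from the text and the method names (`offerer`, `state_count`, `all_states`) are hypothetical.

```python
from dataclasses import dataclass
from itertools import product
from typing import Iterator, List, Optional, Tuple


@dataclass
class CATransition:
    source: List[int]  # starting state, one entry per principal
    label: List[int]   # > 0 offer, < 0 request, 0 idle
    target: List[int]  # arriving state

    def offerer(self) -> Optional[int]:
        """Index of the principal performing the offer, or None if there is none."""
        for i, a in enumerate(self.label):
            if a > 0:
                return i
        return None


@dataclass
class CA:
    rank: int
    initial: List[int]
    states: List[int]             # number of local states of each principal
    finalstates: List[List[int]]  # final local states of each principal
    tra: List[CATransition]

    def state_count(self) -> int:
        """Number of global states, never stored explicitly."""
        total = 1
        for n in self.states:
            total *= n
        return total

    def all_states(self) -> Iterator[Tuple[int, ...]]:
        """Enumerate the global states on demand from the per-principal counts."""
        return product(*(range(n) for n in self.states))


ca = CA(
    rank=2,
    initial=[0, 0],
    states=[2, 3],
    finalstates=[[1], [2]],
    tra=[CATransition([0, 0], [1, -1], [1, 1])],
)
```

Storing only the number of local states per principal keeps the description linear in the rank, while the global states can still be generated when an analysis needs them.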

5 Linear Programming and Contract Automata

The properties of weak agreement were introduced for solving circularity issues, in which all services are stuck waiting for the fulfilment of their requests before providing the corresponding offers [4]. For example, consider the services (rendered as regular expressions) \(A = a.\overline{b}\) and \(B = b. \overline{a}\); their product does not admit agreement. Circularity is solved by allowing matches between requests and offers even though they are not simultaneous; intuitively, offers may be fired “on credit” provided that the corresponding requests are honoured later on. A trace of an automaton is a weak agreement if for each request there is a corresponding offer, no matter in which order they occur in the trace. The notions of admitting weak agreement and of weak safety are then similar to the ones of (strong) agreement reviewed earlier. For example, \(A \otimes B\) admits weak agreement. The underlying theory and the decision procedures for the properties of weak agreement are developed in [4], and are formalised as mixed integer linear programming problems. This is because the properties of weak agreement are context-sensitive, and thus no controller (i.e., a contract automaton enforcing them) can exist. Below, we briefly review a component for solving the optimization problems related to contract automata, which complements the functionalities offered by CAT.
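The difference between agreement and weak agreement on a single trace can be illustrated as follows. The Python sketch is not the decision procedure of [4], which works on whole automata through linear programming; it only checks one given trace by counting, assuming the integer encoding of labels used by CAT and the hypothetical codes 1 for a and 2 for b.

```python
from collections import Counter


def is_weak_agreement(trace):
    """Each request has a corresponding offer somewhere in the trace."""
    offers, requests = Counter(), Counter()
    for label in trace:
        for a in label:
            if a > 0:
                offers[a] += 1
            elif a < 0:
                requests[-a] += 1
    return all(requests[a] <= offers[a] for a in requests)


def is_agreement(trace):
    """Every label is a match or an unmatched offer (no unmatched request)."""
    for label in trace:
        active = [a for a in label if a != 0]
        lone_offer = len(active) == 1 and active[0] > 0
        match = len(active) == 2 and active[0] == -active[1]
        if not (lone_offer or match):
            return False
    return True


# A trace of the product of A = a.b-bar and B = b.a-bar: A requests a "on credit",
# then A and B match on b, and finally B offers a.
trace = [(-1, 0), (2, -2), (0, 1)]
```

The trace is a weak agreement because the request on a is eventually covered by an offer on a, but it is not an agreement because that request is not matched when it is fired.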

The decision procedures are implemented in A Mathematical Programming Language (AMPL) [8], a widely used language for describing and solving optimization problems. In this way, the automatic verification of contract automata under properties of weak agreement exploits efficient techniques and algorithms developed in the area of operations research. We now briefly describe the implementation of the techniques for verifying weak agreement. The script flow.run, to be launched with the command ampl, is described below:

[Listing: the AMPL script flow.run]

The script first loads the automaton from the file flow.dat (line 4). The description of the automaton consists of the number of nodes, the cardinality of the alphabet of actions, and a matrix of transitions for each action a, where the value at position \((s,t)\) is 0 if there is no transition from state s to state t labelled by a, 1 if there is an offer transition on a, and \(-1\) if there is a request transition on a. In this case, the contract automaton described in flow.dat is representative.

Fig. 3. The implementation in AMPL of the optimization problems for deciding weak agreement and weak safety

The AMPL linear program to load is given as an input parameter to the script (line 3). Two optimization problems are available: weakagreement.mod, which contains the formalization of the optimization problem for deciding whether a contract automaton admits weak agreement, and weaksafety.mod, which contains the formalization of the optimization problem for deciding whether a contract automaton is weakly safe.

Both formal descriptions are then solved using the solver cplex, that is, the simplex method implemented in C. However, it is possible to select other available solvers in the script flow.run (line 2). The execution of the script prints the values of the variables. As proved in [4], if the variable gamma is non-negative then the contract automaton satisfies the given property. Bi-level optimization problems cannot be defined directly in AMPL. Therefore, we cannot plainly apply the formalisation of [4] for representing weakly liable transitions as an optimization problem. However, different techniques for relaxing the bi-level problem can be used to over-approximate the set of weakly liable transitions, for example Lagrangian relaxation. As future work, we are planning to develop a toolchain for fully integrating the above techniques in CAT, in order to reuse them for the functionalities described in Sect. 3. In particular, CAT will automatically generate a contract automaton description flow.dat, execute the script flow.run and collect the results. The code of weakagreement.mod and weaksafety.mod is depicted in Fig. 3. For further details about CAT, we refer the interested reader to the full documentation, available online at [3].

6 Concluding Remarks

We described CAT, a tool supporting the analysis of communication-centric applications by means of novel techniques based on combinatorial optimization. A non-trivial example was used to show the main features of CAT.

An interesting application domain for CAT is service-oriented applications. In this context, model-driven approaches have been advocated for the analysis of service composition. In particular, automata have been used as target models to translate BPEL processes [11] in [9, 13]; for instance, the constraint automata semantics of REO [1, 2] is used in [12] to analyse web services. Relations of contract automata with service composition are studied in [4–6]. The properties verified by CAT have not been considered by other approaches. For example, the identification of liable transitions that may spoil a composition, even in the presence of circular dependencies among services (see Sect. 5), complements the verification done in [12]. We conjecture that it would be possible to define model transformations from contract automata to BPEL which preserve the analysis discussed here.

A model-driven approach would also ease the integration of CAT with e.g., the tools discussed above. This would provide developers with a wide variety of tools for guaranteeing the quality of the composition of services according to different criteria.

The tool is still a prototype; we plan to improve its efficiency, extend it with new functionalities (e.g., relaxation), and improve its usability (e.g., adding a user-friendly GUI and pretty-printing automata). We note that CAT provides valid support to the analysis of applications. In fact, CAT is able to detect possible violations of the properties of interest (for example, the branching condition and the absence of mixed choice). A drawback of CAT is that it does not support the modelling and design of applications. An interesting evolution of CAT would be to add functionalities for amending applications that violate the properties of interest. For instance, once liable transitions are identified, CAT could suggest how to modify services to guarantee the property. This may also be coupled with the model-driven approach by featuring functionalities that trace transitions back to the actual source code of services.