Abstract
Fault-tolerant distributed algorithms play an increasingly important role in many applications, and their correct and efficient implementation is notoriously difficult. We present an automatic approach to synthesise provably correct fault-tolerant distributed algorithms from formal specifications in linear-time temporal logic. The supported system model covers synchronous reactive systems with finite local state, while the failure model includes strong self-stabilisation as well as Byzantine failures. The synthesis approach for a fixed-size network of processes is complete for realisable specifications, and can optimise the solution for small implementations and short stabilisation time. To solve the bounded synthesis problem with Byzantine failures more efficiently, we design an incremental, CEGIS-like loop. Finally, we define two classes of problems for which our synthesis algorithm obtains solutions that are not only correct in fixed-size networks, but in networks of arbitrary size.
You have full access to this open access chapter, Download conference paper PDF
Similar content being viewed by others
Keywords
- Byzantine Failures
- Linear Time Temporal Logic (LTL)
- Fixed Network Size
- Synthesis Problem
- Fault-tolerant Distributed Algorithms
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.
1 Introduction
Distributed algorithms are hard to implement. While multi-core processors, communicating embedded devices, and distributed web services have become ubiquitous, it is very hard to correctly construct such systems because of the interplay between separate components and the possibility of uncontrollable faults.
While verification methods try to prove correctness of a system that has been implemented manually, the goal of synthesis methods is the automatic construction of systems that satisfy a given formal specification. The difference between these approaches is shown in Fig. 1, illustrating how synthesis can relieve the designer from tedious and error-prone manual implementation and bug-fixing. Despite these benefits, formal methods that guarantee correctness a priori, like synthesis, have hardly found their way into distributed system design. This is in contrast to a posteriori methods like verification, which are being studied very actively [35, 45, 46].
One reason for this is that the underlying computational problems in synthesis are even harder than in verification. However, research on synthesis has picked up again in recent years [1, 8, 24, 36, 52], pushed forward by advances in theorem proving [26, 42] and model checking algorithms [10, 40] that can be used as building blocks for efficient synthesis algorithms.
In particular, also the synthesis of concurrent and distributed systems has received more attention lately. However, research in this area is to a large extent still restricted to basic theoretical problems [23, 24] or to simpler sub-problems, such as synthesis of synchronisation for existing programs [12, 54].
Failure resilience is critical in this setting, for two reasons: firstly, it enables use in safety-critical applications with weaker assumptions on the environment and the component itself. Secondly, it is needed to ensure scalability in practice, since in large networks it is unrealistic to assume that all components work without failure. In this work, we consider two kinds of failures: transient failures, as exemplified by self-stabilising systems [16, 53], where the whole system can be transported to an arbitrary state; and permanent failures, as exemplified by Byzantine failures [38], where some processes can deviate arbitrarily from the algorithm they should be executing.
Previous approaches for the synthesis of failure-resilient systems are either restricted in the systems that are considered [14], or in the kinds of failures that are supported [21].Footnote 1 For systems that support both self-stabilisation and Byzantine failures, the only result known to us is a problem-specific and semi-automatic approach by Dolev et al. [15], explained in the following.
Motivating Example [15]. Consider the problem of automatically constructing a distributed counter, ranging over m possible values. Processes in the system are arranged in a fully informed clique with synchronous timing, and should satisfy the following properties under self-stabilisation and Byzantine failures:
-
(a)
agreement: at every turn, all processes output the same value;
-
(b)
increment: the value is incremented in each step (mod m).
Since increment is an easy-to-implement local property, the main problem is agreement on the value. Dolev et al. [15] have recently shown a semi-automatic approach to obtain solutions for this problem. For a fixed number \({f}\) of Byzantine nodes, they construct the distributed algorithm in two steps:
-
1.
synthesis for a clique of sufficient size \({n}\);
-
2.
extension to arbitrarily many processes.
The first step is based on a problem-specific encoding of the synthesis problem into a SAT problem. For a fixed number \({n}\) of processes, the approach finds a solution – if one exists – by searching for implementations of increasing size and with increasing stabilisation time. The sufficient size of the clique is \({n}= 3\cdot {f}+ 1\), since this guarantees that (after stabilisation) the Byzantine nodes cannot change the majority value [43]. Therefore, processes that are added to a correct system will be correct if they simply replicate the majority output of the existing processes.
The results of Dolev et al. [15] are impressive, since their solutions for the 2-counting problem extend to systems of arbitrary size, and have smaller state space and stabilisation time than any hand-crafted solution before. However, application of their approach to other problems requires significant effort for the development of a problem-specific encoding, and for proving its correctness.
In contrast, we introduce a general-purpose method for synthesis of failure-resilient systems that is fully automatic, can easily be proven correct, and is applicable to a wide range of problems. In particular, our preliminary implementation can replicate the results of Dolev et al. [15] and extend them to n-counting (with \(n>2\)).
Contributions. In this paper, we propose a novel approach for the automatic synthesis of Byzantine-tolerant self-stabilising systems, in the form of distributed labelled transition systems. Our synthesis method takes as input a description of the network of processes and a specification in linear-time temporal logic, as well as a bound on the number of Byzantine processes in the network. It encodes the existence of a solution into a problem in satisfiability modulo theories (SMT), and synthesises correct implementations for all processes, if they exist.
We show that our method is correct and complete, and will terminate if a bound on the size of process implementations is given.
The first-order problems that result from our encoding critically need quantification over finite, but possibly large, sets. We provide a dedicated approach to solve those problems incrementally. On a prototype implementation of the approach, we show that this makes our examples tractable.
Finally, we give new results for extending our synthesis method from networks of fixed size to families of networks of unbounded size, based on the notion of cutoffs and the Parameterised Synthesis approach [30]. In particular, we define colourless specifications (or tasks) for non-terminating systems in cliques and similar network topologies, as well as a class of local specifications for networks with a fixed degree. For colourless specifications, we obtain cutoffs that depend on the number of Byzantine nodes, while for local specifications we obtain cutoffs that depend on the stabilisation time.
Structure. We introduce our system model and class of specifications in Sect. 2, and the basic synthesis approach in Sect. 3. We present the incremental approach for solving our synthesis problem in Sect. 4, the extension to parametric networks in Sect. 5, and experimental results in Sect. 6.
2 System and Failure Model, Specifications
We consider distributed systems that are defined by a fixed network of finite-state processes, in a synchronous composition: in every global step of the system, each process observes the outputs (possibly the complete state) of neighbouring components, and makes a transition. Our composition models atomic snapshot, the classical communication model for self-stabilising systems [13]. Furthermore, synchronous timing (possibly as an abstraction of the system behaviour) is a standard assumption when reasoning about consensus problems, as these problems are undecidable in asynchronous networks in the presence of faults [25, 39, 47]. To support asynchronous systems, one option is to use an abstraction to an effectively synchronous system, like for example in the model based on “communication rounds” by Dragoi et al. [18].
In the following, we formalise these notions for the case of fixed-size networks. We will consider networks of parametric size in Sect. 5.
2.1 System Model
Labelled Transition Systems. For given finite sets \(\varSigma \) of inputs and \(\varUpsilon \) of labels – or outputs – a \(\varUpsilon \) -labelled \(\varSigma \) -transition system (or short: a \((\varUpsilon ,\varSigma )\)-LTS) \(\mathcal {T}\) is a tuple \(({T},{T}_0,\tau ,o)\) of a set \({T}\) of states, a set \({T}_0\subseteq {T}\) of initial states, a transition function \(\tau : {T}\times \varSigma ~\rightarrow ~{T}\) and a labelling (or output) function \(o: {T}~\rightarrow ~\varUpsilon \). \(\mathcal {T}\) is called finite if \({T}\) is finite.
We consider \(\varUpsilon = 2^O\) and \(\varSigma = 2^I\), representing valuations of a set of Boolean output variables O (controlled by the system) and a set of Boolean input variables I (not controlled by the system).
Communication Graphs, Symmetry Constraints. A communication graph \(C\) is a tuple \(({V},{X}, \mathcal {I},\mathcal {O})\), where \({V}\) is a finite set of nodes, \( {X}\) is a set of system variables, and \(\mathcal {I}: {V}\rightarrow \mathcal {P}({X})\), \(\mathcal {O}: {V}\rightarrow \mathcal {P}({X})\) assign sets of input and output variables to the nodes, with \(\mathcal {O}(v) \cap \mathcal {O}(v') = \emptyset \) for all \(v\ne v' \in {V}\). For a given \(v\), we call \((\mathcal {I}(v),\mathcal {O}(v))\) the interface of \(v\), and \((\left| \,{\mathcal {I}(v)}\, \right| ,\left| \,{\mathcal {O}(v)}\, \right| )\) the type of the interface of \(v\). If \(\mathcal {I}(v) \cap \mathcal {O}(v') \ne \emptyset \), i.e., an output of \(v'\) is an input of \(v\), then we say that \(v\) and \(v'\) are neighbours in \(C\). Variables that are not assigned (by \(\mathcal {O}\)) as output variables to any of the nodes are global input variables, controlled by the environment. Denote this set of variables as \(\mathcal {O}(env)\).
The communication graph may come with a symmetry constraint, given as a partitioning \({V}_1 \dot{\cup }\ldots \dot{\cup }{V}_m = {V}\) of the set of nodes. We assume that for every element \({V}_i\) of the partition, nodes \(v, v' \in {V}_i\) have the same type of interface, and that interfaces of all nodes have a fixed order that identifies corresponding in- and outputs of \(v\) and \(v'\). The intended semantics is that nodes in the same element of the partition should have the same implementation modulo this correspondence.
Distributed Systems. An implementation of a node \(v\in {V}\) in a communication graph \(C\) is a \((2^{\mathcal {O}(v)}, 2^{\mathcal {I}(v)})\)-LTS. A distributed system is defined by a communication graph \(C\) and a finite family \(\left( \mathcal L_v\right) _{v\in {V}}\) of implementations.
Let \(C= ({V},{X},\mathcal {I},\mathcal {O})\) be a communication graph with \({V}= \{ v_1, \ldots , v_{{n}} \}\), and for every \(v_i \in {V}\) let \(\mathcal L_i = (L_i, L_{0,i}, \tau _i, o_i)\) be an implementation of \(v_i\) in \(C\). The composition of \(\left( \mathcal L_v\right) _{v\in {V}}\) in \(C\) is the \((2^{X},2^{\mathcal {O}(env)})\)-LTS \(\mathcal G= (G, G_0, \tau , o)\) with:
-
\(G= L_1 \times \ldots \times L_{{n}}\),
-
\(G_0 = L_{0,v_1} \times \ldots \times L_{0,v_{{n}}}\),
-
\(o(l_1,\ldots ,l_{{n}}) = o_1(l_1) \cup \ldots \cup o_{{n}}(l_{{n}})\), and
-
\(\tau ((l_{1},\ldots ,l_{{n}}),e) = ((\tau _{1}(l_{1}, \sigma _{1}),\ldots ,\tau _{{n}}({t}_{{n}}, \sigma _{{n}}))\), where \(\sigma = o(l_1, \ldots , l_{{n}}) \cup e\) and \(\sigma _i\) is the restriction of \(\sigma \) to variables in \( \mathcal {I}(v_i)\).
Note that this is essentially the same formalism as in Finkbeiner and Schewe’s seminal paper [24], and in the following we re-use part of their work on encoding the synthesis problem for such systems into SMT.
2.2 Failure Model
We consider two kinds of failures: transient failures that are limited in time, but may affect the whole system, and permanent failures that are limited in their locations, i.e., only affect a subset of the processes. We model these failures as self-stabilisation and Byzantine failures, respectively. The conjunction of both kinds of failures is called Byzantine tolerant self-stabilisation [17].
Self-Stabilisation. Self-stabilisation is the strongest model for transient failures, introduced by Dijkstra [13, 16, 53]; it assumes that the system as a whole fails – once – and is put in an arbitrary state. When the failure is over, processes resume their execution from this state. In transition systems, it is thus easily modelled by making all global states of the system initial.
Since an arbitrary state of the system will in general not satisfy strict safety requirements, in self-stabilisation one usually requires that a specification will eventually be satisfied, i.e., after a (either fixed or unknown) stabilisation time.
Byzantine Failures. Byzantine failure is a model of permanent failure where some processes do not execute the protocol, but are under the control of a Byzantine adversary. Our assumptions on the adversary are:
-
non-adaptiveness: the adversary picks the set of faulty nodes before the algorithm is run;
-
full information: the adversary can read the global state of the system;
-
computational power: the adversary has unbounded computational power.
In our setting, the non-adaptiveness does not remove any power from the adversary [11].Footnote 2 Therefore, it is equivalent to the strong Byzantine adversary, which subsumes most models of permanent failure. We will consider systems with a fixed upper bound \({f}\) on the number of Byzantine failures.
2.3 Formal Specifications
We consider formal specifications in linear-time temporal logic (LTL), where the atomic propositions are the system variables. A formula that uses only the input and output variables of a tuple \(\overline{v} = (v_1,\ldots ,v_k)\) of nodes will sometimes be written \(\varphi (\overline{v})\). We assume that the body of our specification is of the form
for some \(k \le \left| \,{{V}}\, \right| \).
Example 1
Consider a fully connected network of a set of nodes \({V}\). Suppose every process \(v\in {V}\) has a binary output variable \(c_v\). In the 2-counting problem from Sect. 1, every node \(v\) has an output \(c_v\), and the formal specification \(\varphi \) is the conjunction
stating that (for every node) the binary output should be flipped in every step, and (for all pairs of nodes) the output of two nodes should always be the same.
Fault-Tolerant Specifications. Since we consider systems that exhibit both self-stabilisation and Byzantine failures, we need to consider a special type of specifications:
-
self-stabilisation implies that specifications \(\varphi \) with non-trivial safety requirements (like in Example 1) in general cannot be satisfied without explicitly allowing a stabilisation time. Therefore, we consider specifications \(\varphi \) that are either of the form \({{\mathrm{\mathsf {F}}}}\psi \) (if we allow an unspecified stabilisation time), or of the form \({{\mathrm{\mathsf {X}}}}^{{t}} \psi \) (if we require that the stabilisation time is bounded by \({t}\) steps).
-
Byzantine failures imply that the respective nodes can behave arbitrarily, and properties of the specification can not be expected to hold for them. Therefore, we require that for every choice of the Byzantine nodes, the specification holds only for tuples of correct nodes, i.e., where none of the nodes is Byzantine. Formally, this means that instead of the original specification \(\forall \ \overline{v} \in V^k: \varphi (\overline{v})\) we consider the specification
$$\begin{aligned} \forall \ \overline{b} \in V^{{f}}, \overline{v} \in V^k: \left[ \left( \bigwedge _{1 \le i \le k, 1 \le j \le {f}} v_i \ne b_j \right) \rightarrow \varphi (\overline{v}) \right] . \end{aligned}$$(1)
Example 2
Recall the second part of the 2-counting specification:
For systems with one Byzantine node b in \({V}\), this property is modified to:
3 Bounded Synthesis of Resilient Systems
Synthesising distributed systems is in general undecidable [23, 44, 48]—with or without failures—and only becomes decidable by bounding the size of the implementation. The bounded synthesis problem consists in constructing an implementation that satisfies a given temporal logic specification and a bound on the number of states.
Finkbeiner and Schewe [24] gave an algorithm for bounded synthesis based on an encoding into satisfiability modulo theories (SMT). Inspired by their encoding, we describe in the following an algorithm for the bounded synthesis of distributed systems with Byzantine-tolerant self-stabilisation. The high-level structure of the approach is depicted in Fig. 2.
Input: Specification and Communication Graph. The input to our synthesis problem is a communication graph \(C= ({V},{X},\mathcal {I},\mathcal {O})\), possibly with a symmetry constraint \({V}_1 \dot{\cup }\ldots \dot{\cup }{V}_m = {V}\), and a (global) temporal specification \(\varphi \) over atomic propositions in \({X}\). In the following, let \(O = \bigcup _{v\in {V}} \mathcal {O}(v)\) be the set of global output variables (controlled by the system), and \(I = {X}\setminus O\) the set of global input variables (controlled by the environment) of \(C\).
In the following we encode the existence of local implementations of the nodes in \({V}\) such that the composition of these implementations in \(C\) satisfies \(\varphi \).
Conversion of the Specification to an Automata. Using the approach of Kupferman and Vardi [37], the specification \(\varphi \) is translated into a universal co-Büchi tree automaton (UCT) \(\mathcal {A}_{\varphi }=(Q,q_o,\delta ,F)\), where Q is a finite set of states, \(q_0 \in Q\) the initial state, \(\delta : Q \times 2^O \rightarrow \mathcal {P}(Q \times 2^I)\) a transition relation, and \(F \subseteq Q\) a set of rejecting states. A given UCT \(\mathcal {A}\) accepts an \((2^O,2^I)\)-LTS \(\mathcal {T}\) if no run in the parallel execution of \(\mathcal {A}\) and \(\mathcal {T}\) visits a rejecting state infinitely often. The UCT \(\mathcal {A}_{\varphi }\) is constructed such that it accepts an LTS \(\mathcal {T}\) if and only if \(\mathcal {T}\models \varphi \).
As an optimisation, we use a Safra-less conversion to generalised Rabin automata [50] rather than converting to a co-Büchi automata, as the automatically-generated Rabin automata are smaller. However, co-Büchi and Rabin automata are known to be equally expressive.
Uninterpreted LTS Based on Size Bound and Communication Graph. Let \({s}\) be a local size bound for implementations of nodes in \(C\). Then, for each node \(v_i \in \{v_1,\ldots ,v_n\}={V}\), we want to find a \((2^{\mathcal {O}(v_i)}, 2^{\mathcal {I}(v_i)})\)-LTS \(\mathcal L_i = (L_i,L_{0,i},\tau _i,o_i)\), with:
-
a set of (local) states \(L_i\) with \(\left| \,{L_i}\, \right| = {s}\);
-
a set of initial states \(L_{0,i} \subseteq L_i\);
-
a transition function \(\tau _i: L_i \times 2^{\mathcal {I}(v_i)} \rightarrow L_i\);
-
a labelling function \(o: L_i \rightarrow 2^{\mathcal {O}(v_i)}\).
\(L_i\) can be considered a fixed set of elements, while the transition and labelling functions are to be synthesised. That is, in our SMT encoding they are considered as uninterpreted functions. If \(v_i,v_j \in {V}_k\) for some \({V}_k\) in the symmetry constraint, then we introduce just one uninterpreted function symbol that is used for both \(\tau _i\) and \(\tau _j\), and similarly for \(o_i\) and \(o_j\). This enforces the constraint that both nodes will have the same implementation. The choice of \(L_{0_i}\) is explained below.
Encoding of UCT and LTS into SMT Problem. To encode the synthesis problem, we follow the approach of Finkbeiner and Schewe [24]. Let \(\mathcal G= (G, G_0, \tau , o)\) be the composition of the local implementations \(\mathcal L_i\) in \(C\). Then, we define an (uninterpreted) annotation function \(\lambda : Q \times G\rightarrow \mathbb {Q}\cup \{\bot \}\) that maps states in the product automaton \(\mathcal {A}_{\varphi } \times \mathcal G\) to either \(\bot \) or a rational number. To ensure that \(\mathcal G\) is accepted by \(\mathcal {A}_{\varphi }\), we introduce constraints on \(\lambda \) such that \(\lambda \) tracks whether states in the product automaton \(\mathcal {A}_{\varphi } \times \mathcal G\) are reachable, and bounds the number of visits to rejecting states in runs of \(\mathcal {A}_{\varphi } \times \mathcal G\). In particular, we require that
The conjunction (2) \(\wedge \) (3) \(\wedge \) (4), in the following denoted as \({\mathsf {SMT}}_\varphi \), encodes the existence of an implementation that satisfies \(\varphi \) in a system without failures.
Encoding Self-Stabilisation. We encode self-stabilisation by considering all states of the system as initial states, i.e., \(L_{0_i} = L_v\). Thus, a solution to our synthesis problem has to ensure that the specification \(\varphi \) is satisfied for all runs that begin in any of the states of the composed system. This corresponds directly to the definition of self-stabilisation introduced by Dijkstra [13].
Encoding Byzantine Failures. Byzantine nodes can behave arbitrarily, and the Byzantine adversary has information about the global state of the system [38]. Thus, the behaviour of Byzantine failures can be modelled by allowing them to give arbitrary outputs at any time, essentially re-assigning their outputs to outputs of the adversarial environment.
To encode this, we modify the constraints above such that Byzantine processes are observed with arbitrary output. In particular, if node \(v_i\) is Byzantine, then in (3) and (4) we replace each occurrence of a system variable \(x \in \mathcal {O}(v_i)\) with a fresh variable \(x_b\), and add a quantifier \(\forall x_b \in \mathbb {B}\). For a given formula \(\psi \) and \(\overline{x}(v)=\mathcal {O}(v)\), this substitution is denoted as \(\forall \overline{x_b}(v) \in \mathbb {B}.~ \psi [\overline{x}(v) \mapsto \overline{x_b}(v)]\).
Example 3
Consider that the first component \(v_1\) is Byzantine, let \(\overline{x}=\mathcal {O}(v_1)\), \(g= (l_1,\ldots ,l_n)\) and \(\sigma = (\sigma _1,\ldots ,\sigma _n)\). Then, constraint (3) is modified to:
Finally, since the Byzantine adversary can choose a set of at most \({f}\) Byzantine nodes, we have to quantify over all possible choices of the adversary. Since the satisfaction of the specification can depend on the choice of processes, we have an important change to the encoding: our quantification has to reflect that the correctness argument, and therefore the witness function \(\lambda \), can depend on this choice, while the transition and labelling functions \(\tau , o\) must not depend on this choice. This results in the following encoding:
Note that with this encoding, all neighbours of a Byzantine process observe the same outputs. It is straightforward to have different observed outputs for different neighbours in our encoding, at the cost of introducing one quantified variable for the observation of each neighbour.Footnote 3
Furthermore, note that all quantification is over finite sets, so we can eliminate all quantifiers by Skolemising \(\lambda \) such that it is a function that depends on the choice of \(\{b_1,\ldots ,b_{{f}}\}\), and explicitly instantiating the universal quantifiers.
Correctness. For uninterpreted functions \(\tau , o, \lambda \), satisfiability of our encoding is equivalent to the existence of an LTS that satisfies the specification \(\varphi \). Moreover, satisfying valuations of \(\tau \) and o give us a solution to the synthesis problem, and the valuation of \(\lambda \) witnesses correctness of that solution.
Theorem 1
(Correctness for fixed bound). The constraint system (5) is satisfiable if and only if the specification is finite-state realisable in a self-stabilising system with \({f}\) Byzantine nodes in the given communication graph. A satisfying assignment of \(\tau \) and o represents a solution to the synthesis problem.
Up to the encoding of failures, our encoding is equivalent to that of Finkbeiner and Schewe, and correctness follows from Theorem 5 of [24]. Correctness of the encoding of self-stabilisation is straightforward, and correctness of the encoding of Byzantine failures follows from our elaborations above.
Increasing the Bound. By iterating bounded synthesis for increasing bounds, we obtain a semi-decision procedure for the synthesis problem.
Corollary 1
(General correctness and completeness). A semi-procedure that iterates bounded synthesis of resilient systems for increasing bounds will eventually find a finite-state implementation of \(\varphi \) if it exists.
Practical Applicability. Our encoding includes a large number of quantifiers, both universal and existential. Since we consider finite domains, they could all be explicitly instantiated, but experiments show that their full instantiation results in a combinatorial blowup that quickly makes the SMT formula intractable. For non-trivial examples, existing SMT solvers (such as Z3 and CVC3) were unable to solve the resulting problem instances.
Abstracting from the universal quantifiers inside the innermost existential quantifier (as these are treated rather efficiently by existing methods like incremental instantiation [29, 41] or in some cases simply full instantiation) and the concrete meaning of the function symbols, our synthesis problem is of the form
where Q(x, y, z) is an SMT formula and x, y, z are from finite domains. In the following, we introduce a new, incremental algorithm that performs well for instances of this problem produced by our encoding.
4 Incremental Synthesis Algorithm
In this section, we introduce an algorithm that allows us to solve synthesis problems more efficiently than a direct application of an SMT solver on the full encoding of the previous section. To this end, we extend the approach of Counter-Example-Guided Inductive Synthesis (CEGIS) [52][51, Chapt. 4] to handle finite model extraction for first-order formulae with two quantifier alternations (\(\exists \forall \exists \)). Like CEGIS, our algorithm is only guaranteed to terminate when the universal quantification is over a finite domain.
4.1 Previous Work
Solar-Lezama et al. introduced CEGIS [51, 52] in the context of template-based synthesis, but it is a general method for solving first-order problems of the form \(\exists x.\,\forall y.\,Q(x,y)\). It is complete and terminating if y belongs to a finite domain Y. CEGIS performs model extraction, which is crucial when doing synthesis.
In the following, we will use x, y, z as first-order variables and \(\hat{x}, \hat{y}, \hat{z}\) as concrete values for these variables. CEGIS proceeds by building a database of counterexamples \(\hat{y}_i\) for any candidate \(\hat{x}\) that it has encountered. In the worst case, CEGIS performs \(O\left( \left| Y\right| \right) \) SMT queries until it reaches a conclusion; it is especially efficient if every \(\hat{y}\) eliminates a large portion of the possible values for x.
The CEGIS algorithm is shown in Fig. 3. Formula \(\phi (x)\) acts as the database of counterexamples. The algorithm uses two incremental SMT solvers. In Line 3, it extracts candidates for x that work for all counterexamples in the database. In Line 5 it uses a new variable \(y_{n_1}\) to extract a new counterexample that rejects at least the last candidate \(\hat{x}\).
4.2 Extension to First-Order Model Extraction
The encoding of our synthesis problem is of the form
In the specific case of our encoding, described earlier, x was called \(\tau \) and ranges over uninterpreted functions over finite domains; y was called B and ranges over tuples of process identifiers from a finite domain and z was called \(\lambda \) and ranges over the rationals.
While we still keep a set of counterexamples \(\hat{y}_i\), candidate generation is now a little more intricate: we look for one \(\hat{x}\) and a \(\hat{z}_i\) for every \(\hat{y}_i\) in the database.
The algorithm is shown in Fig. 4. Here, \(y_1\dots {}y_n\) is (still) the database of counterexamples. The candidate extraction is again in Line 3, but is more intricate: it now extracts a candidate \(\hat{x}\) for x as well as a candidate \(\hat{z}_i\) for each counterexample \(\hat{y}_i\). In Line 5 we then look for a new counterexample \(\hat{y}_{n+1}\) that shows that the formula is false for \(\hat{x}\) and any of the \(\hat{z}_i\).
Note that, again, we can use two incremental SMT solvers. In the outer satisfiability call (Line 3), we only add conjunctive constraints to \(\phi \). In the inner satisfiability call (Line 5), we add conjunctive constraints to \(\bigwedge _{i=1}^n \lnot Q(\hat{x},y_{n+1}, \hat{z}_i)\) as long as \(\hat{x}\) does not change, and we reset the formula if \(\hat{x}\) does change.Footnote 4
Correctness Argument. Let us assume the algorithm returns \(\hat{x}\). At the point where it returns, it has concrete values \(\hat{z}_1\ldots \hat{z}_n\) such that there is no y that falsifies \(Q(\hat{x}, y, \hat{z}_i)\) for all i. This means that for any y, there is a \(\hat{z}_i\) such that \(Q(\hat{x}, y, \hat{z}_i)\) is satisfied: we indeed exhibited a valid model for the formula.
Conversely, let us assume the algorithm returns FALSE: this means that the formula \(\phi = \bigwedge _{i=1}^n{Q(x, \hat{y}_i, z_i)}\) is not satisfiable. Assuming our original formula was satisfiable, and given a model \(x, z(\cdot )\) that satisfies it, then \(x, z_i = z(\hat{y}_i)\) would be a model for \(\phi \): hence, our original formula is UNSAT.
Finally, termination of the algorithm follows from the fact that our domains are finite, which implies that every formula only has finitely many satisfying assignments, and every call to the inner SMT solver strengthens the formula \(\phi \) such that at least one satisfying assignment is removed.
4.3 Related Work
Our work is close to Janota et al. [31] which extends CEGIS to decide QBF formulas with arbitrary quantifier alternation. Janota et al. propose a recursive algorithm which uses a number of nested SMT calls linear in the number of quantifier alternations, whereas we need only two. Moreover, since candidate values are changed by subsequent SMT calls more often, the algorithm cannot efficiently use incremental solving.
Another modification of CEGIS that is close to ours is that of Koksal et al. [34]. At a high level (i.e., the level we chose for our description in this section), their approach is very similar to ours. The differences between the algorithms are in the encoding of synthesis problems, as well as in the specialised verification and synthesis algorithms that are part of the description in Koksal et al. [34]. We chose a higher level of description for the CEGIS algorithm in order to increase its re-usability in different settings.
Finally, another approach for the synthesis of reactive systems that uses incremental refinement of candidate models is lazy synthesis [22]. The difference to our approach is that lazy synthesis is not based on CEGIS and a direct encoding of correctness into SAT or SMT, but instead uses LTL model checking and an encoding of error traces into SMT to obtain and refine candidate models. Lazy synthesis does not consider systems with Byzantine failures, but could probably be extended to our setting by extending the LTL model checking to all possible choices of Byzantine nodes and all possible actions taken by the Byzantine adversary. Whether this would be efficient is an open question.
5 Extension to Networks of Unbounded Size
The synthesis method we have introduced thus far is restricted to systems with a fixed number of components. However, correctness in networks of arbitrary size is needed for scalability, as it is unfeasible to synthesise a new solution whenever new processes are introduced into the system. In this section, we show how to obtain process implementations that are correct in systems of arbitrary size, based on the idea of Parameterised Synthesis [30]: by combining a general correctness argument for a specific class of systems and specifications, we can synthesise systems that will be correct in networks of unbounded size by synthesising a solution that (i) satisfies the specification and (ii) belongs to the class of systems for which the correctness argument holds.
Parameterised Systems. Let \(\mathcal {C}\) be the set of all communication graphs. Then a parameterised communication graph is a function \(\varGamma : \mathbb {N}\rightarrow \mathcal {C}\), where we assume that every \(\varGamma (i)\) comes with a symmetry constraint that separates the nodes into a finite number of implementation classes (with identical interface types). A parameterised communication graph \( \varGamma \) is of order k if, for all \(n \in \mathbb {N}\), the number of implementation classes in \(\varGamma (n)\) is less or equal to k. Then, an implementation of a parameterised communication graph \(\varGamma \) of order k is a set of implementations \(\{ \mathcal {T}_1, \ldots , \mathcal {T}_k \}\) for its nodes, one for each implementation class.
Parameterised Specifications. In specifications of parameterised systems, the atomic propositions are the system variables, indexed by fixed component identifiers or identifier variables. An identifier variable \(i\) can be quantified globally in the form \(\forall i. \varphi \), or locally in the form \(\forall i:neighbour(x). \varphi \). In every given instance of the parameterised communication graph, this quantification is simply interpreted as a finite conjunction over all possible values for i.
Cutoffs for Parameterised Synthesis. A central notion of parameterised synthesis is the cutoff: an upper bound c on the number of nodes in a network that need to be considered, such that components that are correct in the network of size c are also correct in any network of a bigger size. Formally, \(c \in \mathbb {N}\) is a cutoff for a set of specifications \(\varPhi \) and a class of systems S if, for every \(\varphi \in \varPhi \) and every \((\varGamma , \{ \mathcal {T}_1, \ldots , \mathcal {T}_k \}) \in S\) (where \(\varGamma \) is of order k), it holds that
Thus, a cutoff enables parameterised synthesis if and only if we can guarantee that our solution belongs to the system class S. In principle, this idea can directly be applied to failure-resilient systems, but existing cutoff results [2, 3, 7, 19, 20, 27] usually do not take into account fault-tolerance.
Colourless Specifications. In distributed systems, there is a classical notion of (weakly) colourless tasks for terminating executions of a system. This includes many important properties of finite runs, such as consensus and k-set agreement. We extend this notion to infinite runs.
For a given global state \(g=(l_1,\ldots ,l_n)\) of a system \( \mathcal G\), a variant of \(g\) is a state \(g'\) that can be obtained from \(g\) by changing the local state \(l_i\) of one process i to another local state \(l_i' \in L_i\), such that \(o_i(l_i') = o_j(l_j)\) for some \(j \ne i\), or by a sequence of such changes.
Then, define a specification \(\varphi \) to be colourless if for every execution trace \(o(g_0) o(g_1) \ldots o(g_n) \ldots \) that satisfies \(\varphi \), and any variant \(g_{n}'\) of \(g_{n}\), the partial trace \(o(g_0) o(g_1) \ldots o(g_{n-1}) o(g_{n}')\) can be extended to a trace that satisfies \(\varphi \).
An example of a colourless specification is the m-counting specification from our motivating example. Note that colourlessness is a semantic property of a specification, and we do not supply a syntactic fragment of LTL that guarantees colourlessness.
Cutoffs for Colourless Specifications. We show how to extend an n-process system \(\mathcal G\) satisfying a colourless specification \(\varphi \) into an \((n+k)\)-process system, satisfying the same specification. We assume that the processes in \(\mathcal G\) are fully connected (i.e., in a clique) and that state labels are unique, i.e., the output of a process is sufficient to conclude its current internal state. Based on these assumptions, we show how to synthesise a system that can be considered a larger clique, where the additional processes can have a different implementation.
The additional processes will have an implementation \(\mathcal L'\) that is different from that of the original processes. \(\mathcal L'\) reads the current input from the environment and the outputs of processes in the original clique, uses this information (which by assumption lets us conclude their current internal state) as well as knowledge about the original implementation \(\mathcal L\) to anticipate their next transition, and moves to a state that has the same output as the majority of the next-states in the clique.
To ensure that this will result in a correct system even under up to \({f}\) Byzantine nodes, we need to enforce an \({f}\) -majority property in the original system: in every round, the output chosen by the largest number of correct nodes is picked by \({f}\) more nodes than the second largest one.Footnote 5 Then, even if the computation of \(\mathcal L'\) described above includes Byzantine nodes, its output will be equal to that of the majority of original implementations \(\mathcal L\), and therefore the colourless specification will still be satisfied.
If we are synthesising the original system, the \({f}\) -majority property can be directly encoded as an additional cardinality constraint over the outputs. This constraint preserves satisfiability of the synthesis constraints, even for a given state space.
To see this, assume that a given original system satisfies a colourless specification, but does not have the \({f}\) -majority property. Then we can transform it into a system \(\mathcal G'\) which simulates \(\mathcal G\). At each step, processes in \(\mathcal G'\) simulate \(\mathcal G\) for one step, and check whether \({f}\)-majority is achieved. If it is not, then we can (partially) determinise the given system to obtain \({f}\) -majority: for instance, nodes can be grouped by output value, and state and output value of some nodes can be replaced with ones from the largest group. The modified system still produces valid runs thanks to the specification being colourless.
Theorem 2
If \(\varphi \) is a colourless specification, \(C\) is a fully informed clique and \(\left( \mathcal L_v\right) _{v\in {V}}\) a set of implementations such that their composition \(\mathcal G\) in \(C\) has the \({f}\)-majority property and \(\mathcal G\,\models \, \varphi \), then any extension of \(\mathcal G\) with additional processes \(\mathcal L'\) as described above will satisfy \(\varphi \).
Cutoffs for Local Specifications in Regular Networks. We can also obtain cutoffs for the setting that satisfies the following:
-
the networks has a constant-degree – also called regular – where all nodes have the same interface and implementation,
-
local specifications: specifications of the form \(\forall i.~{{\mathrm{\mathsf {X}}}}^{{t}} {{\mathrm{\mathsf {G}}}}\phi (i)\) (where \(\phi (i)\) is a Boolean formula over the current state of processes in a maximal distance of r to a process i),
-
a fixed number of Byzantine nodes \({f}\) in a distance of r around any node, and
-
a fixed stabilisation time \({t}\).
Theorem 3
Let \(C\) be a constant-degree network with a given interface for all nodes, and such that all nodes have a maximal distance of \(r+{t}\) from a central node \(v\). If an implementation \(\mathcal L\) satisfies a local specification \(\varphi (v)\) in \(C\), then \(\mathcal L\) satisfies \(\forall i.~\varphi (i)\) in any \(C'\) with the same degree, the same interface, and a radius greater than \(r+{t}\).
The cutoff follows from the fact that our specifications only require that we enter the “legitimate states” specified by \(\phi (i)\) within \({t}\) steps, and never leave them afterwards, and within these \({t}\) steps only information from nodes with this distance can enter the radius around i that \(\phi (i)\) talks about. Because of full symmetry in these systems, it is sufficient to require \(\varphi (v)\) instead of \(\forall i.~\varphi (i)\).
Specifications that can be expressed as purely first-order formulae can be rewritten as local specifications [9, 49]. This suggests that local formulae are expressive enough to be of interest: for instance, consensus is local despite k-set agreement being non-local.Footnote 6
6 Experimental Results
A preliminary implementation was written in OCaml, using Sickert’s formally proven correct tool [50] to convert LTL specifications to automata, and de Moura and Bjørner’s Z3 [42] as the backend SMT solver.
Experiments were run on a number of computers equipped with 64GiB of memory and eight cores clocked at 2.6 GHz. Note that our solver is sequential and does not take advantage of multicore machines.
We were able to reproduce the results from Dolev et al. [15] regarding synchronous 2-counting with a single Byzantine adversary (\(f=1\)). Each experiment – for a fixed set of parameters – took at most one hour. As in previous results [15], those solutions can be extended to any system of greater size while keeping the number f of failures, the stabilisation time t and the local state size s constants.
Moreover, we were able to synthesise a symmetric solution for 4-counting, for 4 processes with 5 states each, and stabilisation time 10. This improves on the solution suggested by Dolev et al. to simply duplicate a 2-counter to obtain a 4-counter, which would result in an implementation with 6 local states and a stabilisation time of at least 14 in this case. To our knowledge, this is the first instance of a solution to n-counting (with \(n>2\)) ever synthesised. This result shows that our more general approach allows us to obtain even more efficient implementations than that of Dolev et al., without the need to manually devise a new encoding and argue about its correctness.
Attempts to replicate these results using directly a first-order model finder – such as CVC3 [5] – or existing extensions of CEGIS [31] resulted either in timeout (no result within 12h) or running out of memory.
Moreover, as mentioned in Sect. 3, we use a translation from LTL to Rabin automata [50]: we compared that approach to encoding universal co-Büchi automata obtained with ltl3ba [4] and observed a speedup from 25 % to 50 % depending on the instance.
7 Conclusion
We have presented a method to automatically synthesise distributed systems that are self-stabilising and resilient to Byzantine failures. We assume that the systems are specified in LTL. Our results apply to finite network graphs and extend to parameterised synthesis of processes on a graph of arbitrary size under reasonable conditions. The approach follows the basic idea of Bounded Synthesis. It constructs an SMT formula with two quantifier alternations that states that a fault-tolerant implementation of a given size exists, and it is complete if a bound on the size of the process implementation is given. We have presented a CEGIS-style decision procedure to decide such formulas that is far more efficient than existing approaches for the formulas we have encountered. Finally, we show that we can efficiently synthesise a small solution for the 2-counter problem.
In this work, we only consider the synthesis of basic building blocks of distributed systems, modelled as labelled transition systems. To obtain actual large-scale implementations, many additional layers of complexity need to be addressed, and in practice there will be a trade-off between formality and automation on the one hand, and scale or precision of the system model on the other hand, as for example demonstrated in recent work of Hawbitzel et al. [28].
In the near future, we plan to extend our approach to more general timing models and to study more general specifications for parameterised synthesis. In particular, we want to extend our approach to the system model of the PSync language of Dragoi et al. [18], which enables reasoning about asynchronous systems by introducing a notion of “communication rounds”, and will make our approach applicable to a much larger class of problems. Furthermore, we will look into optimisations of the encoding, as described by Khalimov et al. [32, 33] for parameterised synthesis of systems without fault-tolerance.
Notes
- 1.
Related to this are also approaches for the synthesis of robust systems [6], essentially modelling failures in the environment of a single process.
- 2.
Essentially, this is because our model is not probabilistic, and because the protocol must work for any choice of Byzantine nodes and any behaviour they can exhibit, which includes all possible behaviours of an adaptive adversary.
- 3.
Also, note that fail-stop failures can be seen as a special case of Byzantine failures, and can be modelled in a similar way: instead of giving arbitrary outputs, the chosen nodes at some point move into a special stop state, from which they cannot recover.
- 4.
In fact, in our prototype implementation we use a heuristic that avoids throwing away the formula by re-assigning the value of \(\hat{x}\) in the formula whenever the outer SMT call returns a new value. Then, we do not throw away the formula at all, but risk that it grows unnecessarily large. In our experiments, this has shown favourable effects.
- 5.
This is an extension of the argument for proving the exact number of Byzantine failures that can be survived while solving consensus problems [38].
- 6.
It is not sufficient to prove that k-set agreement is not expressible in FO. However, k-set agreement – for any given \(k\ge 2\) – can easily be proven non-local by contradiction.
References
Alur, R., Bodik, R., Juniwal, G., Martin, M.M., Raghothaman, M., Seshia, S., Singh, R., Solar-Lezama, A., Torlak, E., Udupa, A., et al.: Syntax-guided synthesis. In: Formal Methods in Computer-Aided Design (FMCAD), 2013, pp. 1–8. IEEE (2013)
Aminof, B., Jacobs, S., Khalimov, A., Rubin, S.: Parameterized model checking of token-passing systems. In: McMillan, K.L., Rival, X. (eds.) VMCAI 2014. LNCS, vol. 8318, pp. 262–281. Springer, Heidelberg (2014). http://doai.io/10.1007/978-3-642-54013-4_15
Außerlechner, S., Jacobs, S., Khalimov, A.: Tight cutoffs for guarded protocols with fairness. In: Jobstmann, B., et al. (eds.) VMCAI 2016. LNCS, vol. 9583, pp. 476–494. Springer, Heidelberg (2016). doi:10.1007/978-3-662-49122-5_23
Babiak, T., Křetínský, M., Řehák, V., Strejček, J.: LTL to Büchi automata translation: fast and more deterministic. In: Flanagan, C., König, B. (eds.) TACAS 2012. LNCS, vol. 7214, pp. 95–109. Springer, Heidelberg (2012). http://doai.io/10.1007/978-3-642-28756-5_8
Barrett, C.W., Tinelli, C.: CVC3. In: Damm, W., Hermanns, H. (eds.) CAV 2007. LNCS, vol. 4590, pp. 298–302. Springer, Heidelberg (2007)
Bloem, R., Chatterjee, K., Greimel, K., Henzinger, T., Hofferek, G., Jobstmann, B., Könighofer, B., Könighofer, R.: Synthesizing robust systems. Acta Informatica 51(3), 193–220 (2014). http://doai.io/10.1007/s00236-013-0191-5
Bloem, R., Jacobs, S., Khalimov, A., Konnov, I., Rubin, S., Veith, H., Widder, J.: Decidability of Parameterized Verification. Synthesis Lectures on Distributed Computing Theory. Morgan & Claypool Publishers, San Rafael (2015). http://doai.io/10.2200/S00658ED1V01Y201508DCT013
Bloem, R., Jobstmann, B., Piterman, N., Pnueli, A., Sa’ar, Y.: Synthesis of reactive(1) designs. J. Comput. Syst. Sci. 78(3), 911–938 (2012). http://doai.io/10.1016/j.jcss.2011.08.007
Bollig, B.: Logic for communicating automata with parameterized topology. In: CSL-LICS, pp. 18:1–18:10. ACM (2014)
Bradley, A.R.: SAT-based model checking without unrolling. In: Jhala, R., Schmidt, D. (eds.) VMCAI 2011. LNCS, vol. 6538, pp. 70–87. Springer, Heidelberg (2011). http://doai.io/10.1007/978-3-642-18275-4_7
Canetti, R., Damgrd, I., Dziembowski, S., Ishai, Y., Malkin, T.: On adaptive vs. non-adaptive security of multiparty protocols. In: Pfitzmann, B. (ed.) EUROCRYPT 2001. LNCS, vol. 2045, pp. 262–279. Springer, Heidelberg (2001)
Černý, P., Henzinger, T.A., Radhakrishna, A., Ryzhyk, L., Tarrach, T.: Efficient synthesis for concurrency by semantics-preserving transformations. In: Sharygina, N., Veith, H. (eds.) CAV 2013. LNCS, vol. 8044, pp. 951–967. Springer, Heidelberg (2013). http://doai.io/10.1007/978-3-642-39799-8_68
Dijkstra, E.W.: Self-stabilizing systems in spite of distributed control. Commun. ACM 17(11), 643–644 (1974)
Dimitrova, R., Finkbeiner, B.: Synthesis of fault-tolerant distributed systems. In: Liu, Z., Ravn, A.P. (eds.) ATVA 2009. LNCS, vol. 5799, pp. 321–336. Springer, Heidelberg (2009). http://doai.io/10.1007/978-3-642-04761-9_24
Dolev, D., Korhonen, J.H., Lenzen, C., Rybicki, J., Suomela, J.: Synchronous counting and computational algorithm design. In: Higashino, T., Katayama, Y., Masuzawa, T., Potop-Butucaru, M., Yamashita, M. (eds.) SSS 2013. LNCS, vol. 8255, pp. 237–250. Springer, Heidelberg (2013). http://doai.io/10.1007/978-3-319-03089-0_17
Dolev, S.: Self-Stabilization. MIT Press, Cambridge (2000)
Dolev, S., Welch, J.L.: Self-stabilizing clock synchronization in the presence of Byzantine faults. J. ACM (JACM) 51(5), 780–799 (2004)
Dragoi, C., Henzinger, T.A., Zufferey, D.: PSync: a partially synchronous language for fault-tolerant distributed algorithms. In: POPL, pp. 400–415. ACM (2016). http://doai.io/10.1145/2837614.2837650
Emerson, E.A., Namjoshi, K.S.: On reasoning about rings. Int. J. Found. Comput. Sci. 14(4), 527–550 (2003). http://doai.io/10.1142/S0129054103001881
Emerson, E., Kahlon, V.: Reducing model checking of the many to the few. In: McAllester, D. (ed.) Automated Deduction - CADE-17. LNCS, vol. 1831, pp. 236–254. Springer, Berlin Heidelberg (2000)
Faghih, F., Bonakdarpour, B.: SMT-based synthesis of distributed self-stabilizing systems. TAAS 10(3), 21 (2015). http://doai.io/10.1145/2767133
Finkbeiner, B., Jacobs, S.: Lazy synthesis. In: Kuncak, V., Rybalchenko, A. (eds.) VMCAI 2012. LNCS, vol. 7148, pp. 219–234. Springer, Heidelberg (2012)
Finkbeiner, B., Schewe, S.: Uniform distributed synthesis. In: (LICS 2005), pp. 321–330. IEEE Computer Society (2005). http://doai.io/10.1109/LICS.2005.53
Finkbeiner, B., Schewe, S.: Bounded synthesis. STTT 15(5–6), 519–539 (2013). http://doai.io/10.1007/s10009-012-0228-z
Fischer, M.J., Lynch, N.A., Paterson, M.: Impossibility of distributed consensus with one faulty process. J. ACM 32(2), 374–382 (1985). http://doai.io/10.1145/3149.214121
Ganzinger, H., Hagen, G., Nieuwenhuis, R., Oliveras, A., Tinelli, C.: DPLL(T): fast decision procedures. In: Alur, R., Peled, D.A. (eds.) CAV 2004. LNCS, vol. 3114, pp. 175–188. Springer, Heidelberg (2004). http://doai.io/10.1007/978-3-540-27813-9_14
German, S.M., Sistla, A.P.: Reasoning about systems with many processes. J. ACM 39(3), 675–735 (1992)
Hawblitzel, C., Howell, J., Kapritsos, M., Lorch, J.R., Parno, B., Roberts, M.L., Setty, S.T.V., Zill, B.: Ironfleet: proving practical distributed systems correct. In: SOSP, pp. 1–17. ACM (2015)
Jacobs, S.: Incremental instance generation in local reasoning. In: Bouajjani, A., Maler, O. (eds.) CAV 2009. LNCS, vol. 5643, pp. 368–382. Springer, Heidelberg (2009)
Jacobs, S., Bloem, R.: Parameterized synthesis. Log. Methods Comput. Sci. 10, 1–29 (2014). http://arxiv.org/abs/1401.3588
Janota, M., Klieber, W., Marques-Silva, J., Clarke, E.: Solving QBF with counterexample guided refinement. In: Cimatti, A., Sebastiani, R. (eds.) SAT 2012. LNCS, vol. 7317, pp. 114–128. Springer, Heidelberg (2012)
Khalimov, A., Jacobs, S., Bloem, R.: PARTY parameterized synthesis of token rings. In: Sharygina, N., Veith, H. (eds.) CAV 2013. LNCS, vol. 8044, pp. 928–933. Springer, Heidelberg (2013)
Khalimov, A., Jacobs, S., Bloem, R.: Towards efficient parameterized synthesis. In: Giacobazzi, R., Berdine, J., Mastroeni, I. (eds.) VMCAI 2013. LNCS, vol. 7737, pp. 108–127. Springer, Heidelberg (2013)
Köksal, A.S., Pu, Y., Srivastava, S., Bodík, R., Fisher, J., Piterman, N.: Synthesis of biological models from mutation experiments. In: POPL, pp. 469–482. ACM (2013)
Konnov, I., Veith, H., Widder, J.: SMT and POR beat counter abstraction: parameterized model checking of threshold-based distributed algorithms. In: Kroening, D., Păsăreanu, C.S. (eds.) CAV 2015. LNCS, vol. 9206, pp. 85–102. Springer, Heidelberg (2015). http://doai.io/10.1007/978-3-319-21690-4_6
Kuncak, V., Mayer, M., Piskac, R., Suter, P.: Functional synthesis for linear arithmetic and sets. STTT 15(5–6), 455–474 (2013). http://doai.io/10.1007/s10009-011-0217-7
Kupferman, O., Vardi, M.Y.: Safraless decision procedures. In: FOCS 2005, pp. 531–542. IEEE Computer Society (2005). http://doai.io/10.1109/SFCS.2005.66
Lamport, L., Shostak, R., Pease, M.: The Byzantine generals problem. ACM Trans. Program. Lang. Syst. 4(3), 382–401 (1982). http://doai.io/10.1145/357172.357176
Lamport, L.: Brief announcement: leaderless Byzantine paxos. In: Peleg, D. (ed.) Distributed Computing. LNCS, vol. 6950, pp. 141–142. Springer, Heidelberg (2011)
McMillan, K.L.: Applying SAT methods in unbounded symbolic model checking. In: Brinksma, E., Larsen, K.G. (eds.) CAV 2002. LNCS, vol. 2404, pp. 250–264. Springer, Heidelberg (2002). http://doai.io/10.1007/3-540-45657-0_19
de Moura, L., Bjørner, N.S.: Efficient e-matching for SMT solvers. In: Pfenning, F. (ed.) CADE 2007. LNCS (LNAI), vol. 4603, pp. 183–198. Springer, Heidelberg (2007)
de Moura, L., Bjørner, N.S.: Z3: an efficient SMT solver. In: Ramakrishnan, C.R., Rehof, J. (eds.) TACAS 2008. LNCS, vol. 4963, pp. 337–340. Springer, Heidelberg (2008). http://doai.io/10.1007/978-3-540-78800-3_24
Pease, M., Shostak, R., Lamport, L.: Reaching agreement in the presence of faults. J. ACM 27(2), 228–234 (1980)
Pnueli, A., Rosner, R.: Distributed reactive systems are hard to synthesize. In: 2013 IEEE 54th Annual Symposium on Foundations of Computer Science 1990, vol. 2, pp. 746–757 (1990)
Qadir, J., Hasan, O.: Applying formal methods to networking: Theory, techniques and applications. CoRR abs/1311.4303 (2013). http://arxiv.org/abs/1311.4303
Saissi, H., Bokor, P., Muftuoglu, C.A., Suri, N., Serafini, M.: Efficient verification of distributed protocols using stateful model checking. In: SRDS, pp. 133–142. IEEE (2013). http://doai.io/10.1109/SRDS.2013.22
Saks, M.E., Zaharoglou, F.: Wait-free k-set agreement is impossible: the topology of public knowledge. SIAM J. Comput. 29(5), 1449–1483 (2000). http://doai.io/10.1137/S0097539796307698
Schewe, S.: Distributed synthesis is simply undecidable. Inf. Process. Lett. 114(4), 203–207 (2014). http://doai.io/10.1016/j.ipl.2013.11.012
Schwentick, T., Barthelmann, K.: Local norms forms for first-order logic with applications to games and automata. In: Meinel, C., Morvan, M. (eds.) STACS 1998. LNCS, vol. 1373, pp. 444–454. Springer, Heidelberg (1998)
Sickert, S.: Converting linear temporal logic to deterministic (generalised) rabin automata. Archive of Formal Proofs 2015 (2015)
Solar Lezama, A.: Program synthesis by sketching. Ph.D. thesis, EECS Department, University of California, Berkeley (2008). http://www.eecs.berkeley.edu/Pubs/TechRpts/2008/EECS-2008-177.html
Solar-Lezama, A., Tancau, L., Bodík, R., Seshia, S.A., Saraswat, V.A.: Combinatorial sketching for finite programs. In: ASPLOS 2006, pp. 404–415. ACM (2006). http://doai.io/10.1145/1168857.1168907
Tixeuil, S.: Self-stabilizing algorithms. In: Algorithms and Theory of Computation Handbook. Applied Algorithms and Data Structures, 2nd edn, pp. 26.1–26.45. Chapman & Hall/CRC, CRC Press, Taylor & Francis Group (2009)
Vechev, M., Yahav, E., Yorsh, G.: Inferring synchronization under limited observability. In: Kowalewski, S., Philippou, A. (eds.) TACAS 2009. LNCS, vol. 5505, pp. 139–154. Springer, Heidelberg (2009). http://doai.io/10.1007/978-3-642-00768-2_13
Acknowledgements
We thank Igor Konnov, Ulrich Schmid, Josef Widder, and the late Helmut Veith for interesting discussions on formal methods for distributed systems. We also thank the anonymous reviewers for their detailed and insightful comments.
This work was supported by the Austrian Science Fund (FWF) through projects LogiCS (W1255-N23), QUAINT (I774-N23) and RiSE (S11406-N23), and by the German Research Foundation (DFG) as part of the Transregional Collaborative Research Center AVACS (SFB/TR 14) and through project ASDPS (JA 2357/2-1).
Author information
Authors and Affiliations
Corresponding author
Editor information
Editors and Affiliations
Rights and permissions
Copyright information
© 2016 Springer International Publishing Switzerland
About this paper
Cite this paper
Bloem, R., Braud-Santoni, N., Jacobs, S. (2016). Synthesis of Self-Stabilising and Byzantine-Resilient Distributed Systems. In: Chaudhuri, S., Farzan, A. (eds) Computer Aided Verification. CAV 2016. Lecture Notes in Computer Science(), vol 9779. Springer, Cham. https://doi.org/10.1007/978-3-319-41528-4_9
Download citation
DOI: https://doi.org/10.1007/978-3-319-41528-4_9
Published:
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-41527-7
Online ISBN: 978-3-319-41528-4
eBook Packages: Computer ScienceComputer Science (R0)