figure a
figure b

1 Introduction

Due to the increasing prevalence and importance of distributed systems in our society, ensuring reliability and correctness of these systems has become paramount. Computer-aided verification of distributed protocols and algorithms has been a very active research area in recent years [35, 37, 49, 52, 60]. To be practical, models of distributed algorithms need to take into account that communication or processes may be faulty, and correctness guarantees should be given based on a resilience condition that defines the quantity and quality of faults (e.g., how many processes may crash or even behave arbitrarily [13, 45, 48]).

Moreover, many distributed systems consist of an arbitrary number of communicating processes, thus requiring parameterized verification techniques that consider the number of processes as a parameter of the system, and that can provide correctness guarantees regardless of this parameter. However, the parameterized verification problem is in general undecidable, even in restricted settings such as identical and anonymous finite-state processes that communicate by passing a binary-valued token in a ring [58].

Since undecidability arises so easily in this setting, the research into automatic parameterized verification can in principle be divided into two directions: (i) identifying decidable classes of systems and properties [1, 3,4,5,6,7, 15, 24, 26,27,28,29,30, 32, 33, 36, 38, 41, 54], which are often restricted to rather specialized use cases, and (ii) developing semi-decision procedures [2, 16], which are usually much more versatile but come without a termination guarantee. In practice however, the line between these two approaches is not so clear. The line blurs when semi-decision procedures serve as decision procedures for a certain fragment of their possible inputs, and on the other hand, many decidable problems in parameterized verification have a huge complexity [24, 54], and to an engineer it does not make a difference if the verification algorithm is guaranteed to terminate within a thousand years, or whether it comes without a termination guarantee.

In this paper, we aim for practical verification techniques that cover a large class of fault-tolerant distributed algorithms. Our focus is not decidability, but the development of techniques that can in practice verify a wide range of distributed algorithms, with full automation. We build on the formal model of threshold automata (TA) and existing techniques to verify them, which we combine with insights from the theory of well-structured transition systems (WSTS) [32] and with abstraction techniques. This combination allows us to lift some of the restrictions of the existing decidable fragments for TAs, and to verify distributed algorithms that are not supported by the existing techniques.

In particular, existing verification techniques for TAs are usually restricted to reason about shared variables that are monotonically increasing or decreasing over any run of the automaton. An extension of the model that allows both increasing and decreasing variables has been considered before [47], but only to show that the problem becomes undecidable, not to develop a verification technique for this case. In our setting, we can allow increments, decrements and resets of shared variables, while still obtaining a semi-decision procedure, and even a decision procedure in certain cases. In this paper, we demonstrate how this allows us to reason about round-based algorithms such as the phase king consensus algorithm [11] and the Red Belly Blockchain algorithm [23], which both use resets of shared variables at the beginning of a round, without a fixed bound on the number of rounds to be executed.

Contributions. In this paper,

  1. 1.

    we present an extension of threshold automata that allows increments, decrements and resets of shared variables (Sect. 2),

  2. 2.

    we develop a technique (based on well-structured transition systems) that is a decision procedure for general coverability properties of canonical threshold automata [47], and is a semi-decision procedure for our extension (Sect. 35),

  3. 3.

    we develop an additional abstraction that reduces the search space of this technique and additionally allows us to check another type of specification, called reachability properties, on extended threshold automata (Sect. 6),

  4. 4.

    we implement our techniques and demonstrate their performance on a number of examples from the literature, several of which—including the phase king consensus algorithm and the state-of-the-art Red Belly Blockchain algorithm—cannot be modeled in canonical threshold automata (Sect. 7).

2 System Model

In this section, we build on the existing notion of threshold automata (TAs) [45] and generalize their notion of shared variables. TAs are a model of distributed computation that encodes information exchange between processes into a fixed set of shared variables. Shared variables in TAs are usually required to be monotonically increasing along an execution to ensure decidability. We extend the definition such that it permits shared variables to be decreased or reset, which in general introduces undecidability [47]. We then define the semantics of an unbounded number of such TAs running in parallel. In Sect. 2.1, we introduce the notion of an abstract TA and provide the semantics for this abstraction.

Definition 1

A threshold automaton is a tuple \(A=(L, \mathcal {I}, \varGamma , \varPi ,\mathcal {R}, RC)\) where:

  • L is a finite set of locations.

  • \( \mathcal {I}\subseteq L\) is the set of initial locations.

  • \(\varGamma = \{x_0,\ldots ,x_m\}\) is a finite set of shared variables over \(\mathbb {N}_0\).

  • \(\varPi \) is a finite set of parameter variables over \(\mathbb {N}_0\). Usually, \(\varPi = \{n,t,f\}\), where n is the total number of processes, t is a bound on the number of tolerated faulty processes , and f is the actual number of faulty processes.

  • RC, the resilience condition, is a linear integer arithmetic formula over parameter variables. E.g.: \(RC = n > 3t\, \wedge \, t \ge f\). For a vector \(\vec{p}\in \mathbb {N}_0^{|\varPi |}\), we write \(\vec{p}\models RC\) if RC holds after substituting parameter variables with values according to \(\vec{p}\). Then the set of admissible parameters is \(\vec{P}_{RC} = \{\vec{p}\in \mathbb {N}_0^{|\varPi |} : \vec{p}\models RC\}\).

  • \(\mathcal {R}\) is a set of rules where a rule is a tuple \(r=(\textit{from}, \rightarrow , \varphi ,\vec{uv},\tau )\) such that:

    • \(\textit{from},\rightarrow \in L\).

    • \(\vec{uv}\in |\mathbb {Z}|^{|\varGamma |}\) is an update vector for shared variables.

    • \(\varphi \) is a conjunction of lower guards and upper guards. A lower guard has the form: \(a_0 + \sum _{i=1}^{|\varPi |} a_i \cdot p_i \le x\); An upper guard has the form: \( a_0 + \sum _{i=1}^{|\varPi |} a_i \cdot p_i > x\), with \(x \in \varGamma \), \(a_0,\ldots ,a_{|\varPi |} \in \mathbb {Q}\), \(p_1,\ldots ,p_{|\varPi |} \in \varPi \). The left-hand side of a lower or upper guard is called a threshold.

    • \(\tau \subseteq \varGamma \) is the set of shared variables to be reset to 0.

Fig. 1.
figure 1

A threshold automaton for simple voting.

Example 1

Figure 1 sketches a threshold automaton with \(\mathcal {I}=\{v_0,v_1\}\), \(L = \{v_0,v_1, Wait, d_0,d_1\}\), \(\varGamma =\{x_0,x_1\}\), \(\varPi =\{n,t,f\}\). A process in \(v_0\) has a vote of 0 and a process in \(v_1\) has a vote of 1. If at least \(n-t\) processes vote with 0 (respectively, 1), the decision will be 0 (1), modeled by all processes moving to \(d_0\) (\(d_1\)).

Semantics of TA. Given a TA \(A = (L, \mathcal {I}, \varGamma , \varPi ,\mathcal {R}, RC)\), let the function \(N: \, \vec{P}_{RC} \rightarrow \mathbb {N}_0\) determine the number of processes to be modelled (usually, \(N(n,t,f) = n-f\)). Then, the concrete semantics of a system composed of \(N(\vec{p})\) threshold automata running in parallel are defined via a counter system.

Definition 2

A counter system (\(\textsf{CS}\)) of a TA \(A = (L, \mathcal {I}, \varGamma , \varPi ,\mathcal {R}, RC)\) is a transition system \(\mathsf {CS(A)}= (\varSigma , \varSigma _0, \mathcal {T})\) where

  • \(\varSigma \) is the set of configurations. A configuration is a tuple \(\sigma =(\vec{k},\vec{g},\vec{p})\) where:

    • \(\vec{k}\in \mathbb {N}_0^{|L|}\) is a vector of counter values, where \(\vec{k}[i]\) represents the number of processes in location i. We refer to locations by their indices in L.

    • \(\vec{g}\in \mathbb {N}_0^{|\varGamma |}\) is a vector of shared variables’ values, where \(\vec{g}[i]\) is the value of variable \(x_i \in \varGamma \).

    • \(\vec{p}\in \vec{P}_{RC}\) is an admissible vector of parameter values.

  • The set \(\varSigma _0\) contains all initial configurations, i.e., configurations that satisfy

    $$\begin{aligned} \forall x_i \in \varGamma : \,\, \sigma .\vec{g}[i] = 0 \text { and } \sum _{i \in \mathcal {I}} \sigma .\vec{k}[i] = N(\vec{p}) \text { and } \sum _{i \not \in \mathcal {I}} \sigma .\vec{k}[i] = 0 \end{aligned}$$
  • \(\mathcal {T}\subseteq \varSigma \times \mathcal {R}\times \varSigma \) is the set of transitions, where \((\sigma , r, \sigma ') \in \mathcal {T}\) if and only if all of the following conditions hold:

    • \(\sigma '.\vec{p}= \sigma .\vec{p}\) (parameter values never change).

    • \(\sigma '.\vec{k}[r.\rightarrow ] = \sigma .\vec{k}[r.\rightarrow ] + 1\) (one process moves to \(r.\rightarrow \)).

    • \(\sigma '.\vec{k}[r.\textit{from}] = \sigma .\vec{k}[r.\textit{from}] - 1\) (one process moves out of \(r.\textit{from}\))

    • \(\sigma .\vec{g}\models r.\varphi \) (i.e., \(\varphi \) holds after replacing shared variables with values \(\sigma .\vec{g}\))

    • \(\sigma '.\vec{g}= \sigma .\vec{g}+ r.\vec{uv}\)

    • \(\forall x_i \in \tau \) \(\sigma '.\vec{g}[i] = 0\)

    Instead of \((\sigma , r, \sigma ') \in \mathcal {T}\) we also write \(\sigma \xrightarrow {r} \sigma '\). If \((\sigma , r, \sigma ') \in \mathcal {T}\), we say r is enabled in \(\sigma \); otherwise it is disabled.

Paths of \(\textsf{CS}\). A sequence \(\sigma _0, r_0, \sigma _1,\ldots ,\sigma _{k-1},r_{k-1}, \sigma _k\) of alternating configurations and rules is a path of a counter system \(\mathsf {CS(A)}= (\varSigma , \varSigma _0, \mathcal {T})\) if and only if \(\sigma _0 \in \varSigma _0\) and \((\sigma _{i},r_i,\sigma _{i+1}) \in \mathcal {T}\) for \(0 \le i < k\). In this case we also write \(\sigma _0 \rightarrow ^* \sigma _k\). We denote by \(Paths(\mathsf {CS(A)})\) the set of all paths of \(\mathsf {CS(A)}\).

Example 2

Let \(N(n,t,f) = n-f\), \(RC = n > 3t \wedge t \ge f\), then the following is a valid path of the counter system of the TA in Fig. 1:

\([(4,0,0,0,0)(0,0)],r_0,[(3,0,1,0,0)(1,0)],r_0,[(2,0,2,0,0)(2,0)],r_0\),

\([(1,0,3,0,0)(3,0)],r_0,[(0,0,4,0,0)(4,0)],r_2,[(0,0,3,1,0)(4,0)]\).

2.1 Abstract Threshold Automata

The shared variables and parameters of a TA have infinite domains. To facilitate the application of parameterized verification techniques for finite-state processes, we introduce an abstraction of TAs based on parametric interval abstraction [40].

Abstract Domain. Given a TA A, define as \(\mathcal{T}\mathcal{H}=\{d_0,d_1,\ldots ,d_k\}\) the set of thresholds where \(d_0=0,d_1=1\) and \(\forall i >1\) \(d_i\) is a threshold in A. We assume that \(\forall i,j\) \(d_i < d_j\) if \(i < j\). Note that this is always possible for a fixed \(\vec{p}\in \vec{P}_{RC}\). If different \(\vec{p}\in \vec{P}_{RC}\) result in different orders of the \(d_i\), then we consider each of the finitely many such orders separately. Based on this, define the finite set of intervals \(\mathcal {D}=\{I_0,I_1,\ldots ,I_k\}\) where \(I_i = [d_i,d_{i+1}[\) if \(i<k\), and \(I_k = [d_k,\infty [\).

Definition 3

Abstract Threshold Automata. Given a threshold automaton \(A=(L, \mathcal {I}, \varGamma , \varPi ,\mathcal {R}, RC)\), we define the abstract threshold automaton (or \(\overline{\text {TA}}\)) \(\overline{A}=(L, \mathcal {I}, \overline{\varGamma }, \varPi ,\overline{\mathcal {R}})\) where:

  • A and \(\overline{A}\) share the components \(L,\mathcal {I},\varPi \).

  • Let \(\varGamma = \{x_0,\ldots ,x_m\}\), then \(\overline{\varGamma }= \{\overline{x}_0,\ldots ,\overline{x}_m\}\), where each \(\overline{x}_i\) is over the domain \(\mathcal {D}=\{I_0,I_1,\ldots ,I_k\}\).

  • \(\overline{\mathcal {R}}\) is the set of abstract rules. An abstract rule is a tuple \(\overline{r}=(\textit{from}, \rightarrow , \overline{\varphi }, \vec{uv},\) \( \tau )\) where \(\textit{from}, \rightarrow , \vec{uv}, \tau \) are as before, and the abstract guard \(\overline{\varphi }\) is a Boolean expression over equalities between shared variables and abstract values. Formally, let \(\varphi = \varphi _0 \wedge \ldots \wedge \varphi _k\), then \(\overline{\varphi }= \overline{\varphi }_0 \wedge \ldots \wedge \overline{\varphi }_k\) where for \(\varphi _i = (d_j \le x)\), we have \(\overline{\varphi }_i = \bigvee _{c=j}^{k-1} (\overline{x}= [d_c,d_{c+1}[) \vee \overline{x}= [d_k,\infty [\), and for \(\varphi _i = (d_j > x)\), we have \(\overline{\varphi }_i = \bigvee _{c=0}^{j-1} (\overline{x}= [d_c,d_{c+1}[)\).

Example 3

Consider again the TA in Fig. 1 with \(N(n,t,f) = n-f\), \(RC = n > 3t \wedge t \ge f > 1\). We have \(\mathcal{T}\mathcal{H}=\{0,1,t,n-t\}\) and \(\mathcal {D}=\{[0,1[,[1,t[,[t,n-t[, [n-t,\infty [\}\) whose order is induced by RC. Moreover, we have \(\overline{r}_0 = r_0,\overline{r}_1 = r_1\) (due to the absence of a guard), \(\overline{r}_2.\overline{\varphi }= (\overline{x}_0 = [n-t,\infty [)\), \(\overline{r}_3.\overline{\varphi }= (\overline{x}_1 = [n-t,\infty [)\).

To keep the presentation simple, in our definition all shared variables have the same abstract domain. The abstraction can be improved by considering different abstract domains for different variables: for a given shared variable x we can let \(\mathcal{T}\mathcal{H}_x=\{d_0,d_1,\ldots ,d_l\}\) where \(d_0=0,d_1=1\) and \(\forall i >1\) there is a guard \(d_i * x\) with \(* \in \{\ge , <\}\), to obtain a corresponding abstract domain \(\mathcal {D}_x\) for x.

2.1.1 Semantics of \(\overline{\text {TA}}\).

We first over-approximate the semantics of a system composed of an arbitrary number of \(\overline{\text {TA}}\)s by an abstract counter system. We later show how to detect whether a behavior of the abstract counter system corresponds to a concrete behavior of a counter system.

Definition 4

An abstract counter system (\(\textsf{ACS}\)) of \(\overline{A}=(L, \mathcal {I}, \overline{\varGamma }, \varPi ,\overline{\mathcal {R}})\) is a transition system \(\mathsf {ACS(\overline{A})}= (\overline{\varSigma }, \overline{\varSigma }_0, \overline{\mathcal {T}})\) where:

  • \(\overline{\varSigma }\) is the set of abstract configurations. A configuration of \(\mathsf {ACS(\overline{A})}\) is a tuple \(\overline{\sigma }=(\vec{k},\overline{\vec{g}})\) where:

    • \(\vec{k}\in \mathbb {N}_0^{|L|}\) is a vector of counter values where \(\vec{k}[i]\) represents the number of processes in location i.

    • \(\overline{\vec{g}}\in \mathcal {D}^{|\overline{\varGamma }|}\) is a vector of shared variables values, where \(\overline{\vec{g}}[i]\) is the parametric interval currently assigned to \(\overline{x}_i\).

  • The set \(\overline{\varSigma }_0\) contains all initial abstract configurations, i.e., those that satisfy

    $$\begin{aligned} \forall \overline{x}_i \in \overline{\varGamma }: \,\, \overline{\sigma }.\overline{\vec{g}}[i] = I_0 \text { and } \sum _{i \in \mathcal {I}} \overline{\sigma }.\vec{k}[i] \ge 0 \text { and } \sum _{i \not \in \mathcal {I}} \overline{\sigma }.\vec{k}[i] = 0 \end{aligned}$$
  • \(\overline{\mathcal {T}}\subseteq \overline{\varSigma }\times \overline{\mathcal {R}}\times \overline{\varSigma }\) is the set of transitions. A transition is a tuple \(t=(\overline{\sigma }, \overline{r}, \overline{\sigma }')\) where:

    • \(\overline{\sigma }'.\vec{k}[\overline{r}.\rightarrow ] = \overline{\sigma }.\vec{k}[\overline{r}.\rightarrow ] + 1\)

    • \(\overline{\sigma }'.\vec{k}[\overline{r}.\textit{from}] = \overline{\sigma }.\vec{k}[\overline{r}.\textit{from}] - 1\)

    • \(\overline{\sigma }.\overline{\vec{g}}\models \overline{r}.\overline{\varphi }\)

    • \(\overline{\sigma }'.\overline{\vec{g}}= \overline{\sigma }.\overline{\vec{g}}\dot{+} \overline{r}.\vec{uv}\), defined as follows: \(\forall i < |\varGamma | \):

      1. 1.

        \(\overline{\sigma }'.\overline{\vec{g}}[i] = \overline{\sigma }.\overline{\vec{g}}[i]\), if \(\overline{r}.\vec{uv}[i] = 0\)

      2. 2.

        \((\overline{\sigma }'.\overline{\vec{g}}[i] = \overline{\sigma }.\overline{\vec{g}}[i]) \vee (\overline{\sigma }'.\overline{\vec{g}}[i] = \overline{\sigma }.\overline{\vec{g}}[i].next)\), if \(\overline{r}.\vec{uv}[i] = 1\)

      3. 3.

        \((\overline{\sigma }'.\overline{\vec{g}}[i] = \overline{\sigma }.\overline{\vec{g}}[i]) \vee (\overline{\sigma }'.\overline{\vec{g}}[i] = \overline{\sigma }.\overline{\vec{g}}[i].previous)\), if \(\overline{r}.\vec{uv}[i] = -1\)

        the first disjunct in 2 and the second in 3 are omitted if \(\overline{\sigma }.\overline{\vec{g}}[i] = I_0\).

      4. 4.

        \(\forall x_i \in \overline{r}.\tau : \,\, \overline{\sigma }'.\overline{\vec{g}}[i] = I_0\)

    We also write \(\overline{\sigma } \xrightarrow {\overline{r}} \overline{\sigma }'\) instead of \((\overline{\sigma }, \overline{r}, \overline{\sigma }') \in \overline{\mathcal {T}}\).

In contrast to prior work [40], the domain of counters is not abstracted in \(\textsf{ACS}\).

Paths of \(\textsf{ACS}\). A sequence \(\overline{\sigma }_0, \overline{r}_0, \overline{\sigma }_1,\ldots ,\overline{\sigma }_{k-1},\overline{r}_{k-1}, \overline{\sigma }_k\) of alternating abstract configurations and rules is called a path of \(\mathsf {ACS(\overline{A})}= (\overline{\varSigma }, \overline{\varSigma }_0, \overline{\mathcal {T}})\), if \(\overline{\sigma }_0 \in \overline{\varSigma }_0\) and \((\overline{\sigma }_{i},\overline{r}_i,\overline{\sigma }_{i+1}) \in \overline{\mathcal {T}}\) for \(0 \le i \le k\). In this case we also write \(\overline{\sigma }_0 \rightarrow ^* \overline{\sigma }_k\). We denote by \(Paths(\mathsf {ACS(\overline{A})})\) the set of all paths of \(\mathsf {ACS(\overline{A})}\).

Example 4

Let \(I_0=[0,1[,I_1=[1,t[,I_2=[t,n-t[,I_3=[n-t,\infty [ \). The following is a valid path of the abstract counter system of the TA in Fig. 1: \([(4,0,0,0,0)(I_0,I_0)],\overline{r}_0,[(3,0,1,0,0)(I_1,I_0)],\overline{r}_0,[(2,0,2,0,0)(I_2,I_0)],\overline{r}_0\),

\([(1,0,3,0,0)(I_3,I_0)],\overline{r}_2,[(1,0,2,1,0)(I_3,I_0)]\).

Relation Between \(\mathsf {ACS(\overline{A})}\) and \(\mathsf {CS(A)}\). In comparison to \(\textsf{CS}\), in \(\textsf{ACS}\) we drop the resilience condition, as well as the function N that determines the number of processes to be modeled. Moreover, a transition in \(\textsf{ACS}\) may jump from one interval to the next too early and may stay in the same interval although it had to move. We will formalize the relation between the two models in Sect. 4.

3 Specifications

We consider three kinds of specifications: general coverability, which refers to the notion of coverability that is widely used in parameterized verification, e.g., Petri nets [31] or VASS [51], (non-general) coverability and reachability. The latter two are specifications used in prior work on threshold automata [8, 43, 47]. Note that general coverability can express mutual exclusion, e.g., that there cannot be two leaders at the same time, while coverability cannot.

Definition 5

The general parameterized coverability problem is: Given \(\mathsf {CS(A)}\) and a general coverability specification \(\varSigma _{spec} \subseteq \varSigma \), decide if there is a path in \(\mathsf {CS(A)}\) that covers \(\varSigma _{spec}\) (i.e., decide if there is some configuration \(\sigma _r \in \varSigma \) reachable from \(\sigma _0 \in \varSigma _0\) and \(\exists \sigma _{spec} \in \varSigma _{spec}\) where \(\forall i \, \sigma _{spec}.\vec{k}[i] \le \sigma _r.\vec{k}[i]\)).

Definition 6

The parameterized coverability problem is: Given a TA and coverability specification \(L_{spec} = L_{>0}\), decide if there is a path of its \(\mathsf {CS(A)}\) that satisfies \(L_{spec}\) (i.e., decide if there is some configuration \(\sigma _r \in \varSigma \) reachable from \(\sigma _0 \in \varSigma _0\) and \(\sigma _r\) satisfies \(\forall i \in L_{>0} \, \sigma _r.\vec{k}[i] > 0\)).

Definition 7

The parameterized reachability problem is: Given a TA and a reachability specification \(L_{spec} = (L_{=0}, L_{>0})\), decide if there is a path of \(\mathsf {CS(A)}\) that satisfies \(L_{spec}\) (i.e., decide if there is some configuration \(\sigma _r \in \varSigma \) reachable from \(\sigma _0 \in \varSigma _0\) and \(\sigma _r\) satisfies \(\forall i \in L_{=0} \, \sigma _r.\vec{k}[i] = 0\) and \(\forall i \in L_{>0} \, \sigma _r.\vec{k}[i] > 0\)).

We define similarly all three types of problems for an abstract TA and its abstract counter system. Usually, our specifications are definitions of error configurations, and therefore paths that satisfy them, are called error paths.

4 \(\textsf{CS}\) vs \(\textsf{ACS}\)

We now show that the abstraction from \(\textsf{CS}\) to \(\textsf{ACS}\) is complete with respect to the specifications introduced in the previous section.

A path \(\mathsf {\bar{\pi }}= \overline{\sigma }_0, \overline{r}_0,\ldots ,\overline{r}_{m-1}, \overline{\sigma }_m\) in \(\mathsf {ACS(\overline{A})}= (\overline{\varSigma }, \overline{\varSigma }_0, \overline{\mathcal {T}})\) corresponds to the paths \(\pi = \sigma _0, r_0^{c_0},\ldots ,r_{m-1}^{c_{m-1}}, \sigma _m\) (where \(r_i^{c_i}\) simulates applying \(r_i\) \(c_i\) times) of \(\mathsf {CS(A)}=(\varSigma , \varSigma _0, \mathcal {T})\) that satisfy the following conditions:

  • \( RC \wedge (\textstyle \sum _{j \in \mathcal {I}} \sigma _0.\vec{k}[j] = N(n,t,f)) \)

  • \(\forall i < m \; \sigma _{i}.\vec{k}[r_i.\textit{from}] = c_i + \sigma _{i+1}.\vec{k}[r_i.\textit{from}] \wedge \sigma _{i+1}.\vec{k}[r_i.\rightarrow ] = c_i + \sigma _{i}.\vec{k}[r_i.\rightarrow ] \)

  • \( \forall i < m \; \forall x_j \in \varGamma \; x_j \notin r_i.\tau \implies \sigma _{i+1}.\vec{g}[j] = \sigma _{i}.\vec{g}[j] + c_i \cdot r_i.\vec{uv}[j]\)

  • \(\forall i < m \; \forall x_j \in r_i.\tau \; \sigma _{i+1}.\vec{g}[j] = 0\)

  • \(\forall i < m \; \forall x_j \in \varGamma \; \sigma _{i}.\vec{g}[j] \in \overline{\sigma }_{i}.\overline{\vec{g}}[j] \wedge \sigma _{i+1}.\vec{g}[j] \in \overline{\sigma }_{i+1}.\overline{\vec{g}}[j] \)

  • \(\forall i < m \; c_i > 1 \implies ((\sigma _{i+1}.\vec{g}- r_i.\vec{uv}) \models r_i.\varphi )\)Footnote 1

Let \(Concretize(\mathsf {\bar{\pi }})\) be the conjunction of the constraints above, where quantified formulas are instantiated to a finite conjunction of quantifier-free formulas. Note that \(Concretize(\mathsf {\bar{\pi }})\) is a quantifier-free formula in linear integer arithmetic, and a satisfying assignment for \(Concretize(\mathsf {\bar{\pi }})\) (that can be computed by an SMT solver) represents a path of \(\mathsf {CS(A)}\) that corresponds to \(\mathsf {\bar{\pi }}\). We say that a path \(\mathsf {\bar{\pi }}\in Paths(\mathsf {ACS(\overline{A})})\) is spurious if \(Concretize(\mathsf {\bar{\pi }})\) is unsatisfiable.

For a given \(\mathsf {\bar{\pi }}\in Paths(\mathsf {ACS(\overline{A})})\), let \(Cover(\mathsf {\bar{\pi }})= \forall l \in L \; \sigma _m.\vec{k}[l] \ge \overline{\sigma }_m.\vec{k}[l]\). We can showFootnote 2 the following connection between error paths in \(\mathsf {CS(A)}\) and \(\mathsf {ACS(\overline{A})}\):

Lemma 1

\(\mathsf {ACS(\overline{A})}\) has a non-spurious path that covers a set of configurations \(\varSigma _{spec} \subseteq \varSigma \) iff \(\mathsf {CS(A)}\) has a path that covers \(\varSigma _{spec}\).

Note that Lemma 1 subsumes the case of non-general coverability.

Similarly, if \(\mathsf {\bar{\pi }}\) is a non-spurious path of \(\mathsf {ACS(\overline{A})}\) that satisfies a reachability specification \(L_{spec}\), let \(Reach(\mathsf {\bar{\pi }})= \forall l \in L \; \sigma _m.\vec{k}[l] > 0 \iff \overline{\sigma }_m.\vec{k}[l] > 0\). Then we can show the following with a similar proof as above, where we replace \(Cover(\mathsf {\bar{\pi }})\) with \(Reach(\mathsf {\bar{\pi }})\) and reason accordingly.

Lemma 2

\(\mathsf {ACS(\overline{A})}\) has a non-spurious path that satisfies a reachability specification \(L_{spec}\) iff \(\mathsf {CS(A)}\) has a path that satisfies \(L_{spec}\).

5 Checking General Parameterized Coverability

In this section, we show how abstract counter systems (including ours) can be framed as well-structured transition systems. We also introduce a parameterized model checking algorithm for checking general parameterized coverability in \(\textsf{ACS}\).

5.1 Well-Structured Transition Systems

Well-structured transition systems [32] (WSTS) are a class of infinite-state systems for which the general parameterized coverability problem is decidable [1, 39]. In the following, we recap the standard definitions of WSTS.

Well-Quasi-Order. Given a set \(S\), a binary relation \(\preceq \, \subseteq S\times S\) is a well-quasi-order (wqo) if \(\preceq \) is reflexive, transitive, and if any infinite sequence \(s_0,s_1,\ldots \in S^{\omega }\) contains a pair \(s_i \preceq s_j\) with \(i < j\). A subset \(R \subseteq S\) is an anti-chain if any two distinct elements of R are incomparable wrt. \(\preceq \). Therefore, \(\preceq \) is a wqo on \(S\) if and only if it is reflexive, transitive, and has no infinite anti-chains. The upward closure of a set \(R \subseteq S\), denoted \({\uparrow }R\), is the set \(\{s\in S\mid \exists s' \in R: s' \preceq s\}\). We say that R is upward-closed if \({\uparrow }R = R\), and we call \(B \subseteq S\) a basis of R if \({\uparrow }B = R\). If \(\preceq \) is also anti-symmetric, then any basis of R has a unique subset of minimal elements. We call this set the minimal basis of R, denoted minBasis(R).

Compatibility. Given a transition system \(M=(S,S_0,\mathcal {T})\), we say that a wqo \(\preceq \; \subseteq S\times S\) is compatible with the transition relation \(\mathcal {T}\) if the following holds:

figure c

where \(s \xrightarrow s'\) is a transition in \(\mathcal {T}\), and is a path in M.

WSTS. We say that \((M, \preceq )\) with \(M=(S,S_0,\mathcal {T})\) is a well-structured transition system if \(\preceq \) is a wqo on \(S\) that is compatible with \(\mathcal {T}\). The set of immediate predecessors of a set \(R\subseteq S\) is \(pred(R) = \{s\in S\mid \exists s' \in R: s \xrightarrow s' \}.\) We say that a WSTS \((M,\preceq )\) has effective pred-basis if there exists an algorithm that takes as input any finite set \(R \subseteq S\) and returns a finite basis of \(pred({\uparrow }R)\).

5.2 Abstract Counter Systems as WSTS

To prove that the general parameterized coverability problem is decidable for abstract TAs, we show that an \(\textsf{ACS}\) can be framed as WSTS. Here, for a given set \(E' \subseteq \overline{\varSigma }\) we define \(pred(E') = \{ \overline{\sigma }\in \overline{\varSigma }\mid \overline{\sigma }' \in E' \wedge \exists \overline{r}\in \overline{\mathcal {R}}\text { s.t. } (\overline{\sigma }, \overline{r}, \overline{\sigma }') \in \overline{\mathcal {T}}\}\).

Lemma 3

Given an abstract counter system \(\mathsf {ACS(\overline{A})}= (\overline{\varSigma }, \overline{\varSigma }_0, \overline{\mathcal {T}})\) let \(\lesssim \; \subseteq \overline{\varSigma }\times \overline{\varSigma }\) be the binary relation defined by:

$$(\vec{k},\overline{\vec{g}}) \lesssim (\vec{k}',\overline{\vec{g}}') \iff \vec{k}\le \vec{k}' \wedge \overline{\vec{g}}= \overline{\vec{g}}'$$

where \(\le \) is the component-wise ordering of vectors. Then \((\mathsf {ACS(\overline{A})},\lessapprox )\) is a WSTS.

Note also that the order \(\lesssim \) is anti-symmetric, and therefore every upward-closed set of configurations has a unique minimal basis.

Lemma 4

Given an abstract counter system \(\mathsf {ACS(\overline{A})}= (\overline{\varSigma }, \overline{\varSigma }_0, \overline{\mathcal {T}})\), the WSTS \(((\overline{\varSigma }, \overline{\varSigma }_0, \overline{\mathcal {T}}), \lesssim )\) has effective pred-basis.

Let BasisTrans be the transitions from which we computed CPredBasis in the proof of Lemma 4. Concretely, let \(\vec{u}_j\) be the unit vector with \(\vec{u}_j(j) = 1\) and \(\vec{u}_j(i) = 0\) for \(i \ne j\) then \(BasisTrans(E')\) is the set

$$ \left\{ \begin{array}{l|l} ((\vec{k},\overline{\vec{g}}), \overline{r},(\vec{k}',\overline{\vec{g}}')) \in \overline{\mathcal {T}}& (\vec{k}',\overline{\vec{g}}') \in E' \wedge \overline{r}= (l_i,l_j,\overline{\varphi },\vec{uv}, \tau ) \wedge \\ & \left[ (\vec{k},\overline{\vec{g}}) \xrightarrow {\overline{r}} (\vec{k}',\overline{\vec{g}}') \vee ((\vec{k},\overline{\vec{g}}) \xrightarrow {\overline{r}} (\vec{k}' {+} \vec{u}_j,\overline{\vec{g}}'){\wedge }\vec{k}'[j]{=}0) \right] \end{array} \right\} . $$

The following corollary, derived from the aforementioned definitions, will be instrumental in establishing the correctness of our algorithms.

Corollary 1

Given \(\overline{\sigma }'_1,\overline{\sigma }'_2 \in \overline{\varSigma }\), if \(\overline{\sigma }'_1 \lesssim \overline{\sigma }'_2\) then \(\{\overline{r}\in \overline{\mathcal {R}}\mid (\overline{\sigma }_1,\overline{r},\overline{\sigma }'_1) \in BasisTrans(\overline{\sigma }'_1) \} = \{\overline{r}\in \overline{\mathcal {R}}\mid (\overline{\sigma }_2,\overline{r},\overline{\sigma }'_2) \in BasisTrans(\overline{\sigma }'_2) \}\).

5.3 WSTS-Based General Parameterized Coverability Checking

In this section, we present our algorithm for solving the general parameterized coverability problem (see Definition 5). Given an abstract counter system \(\mathsf {ACS(\overline{A})}\) and a finite set of error configurations \(ERR \subseteq \overline{\varSigma }\), we say that a path of \(\mathsf {ACS(\overline{A})}\) is an error path if it starts in \(\overline{\varSigma }_0\) and ends in the upward closure of ERR. Lemma 4 enables us to use the transitive closure of CPredBasis(ERR) to compute a set of error paths in \(\mathsf {ACS(\overline{A})}\) or a fixed-point in which no initial configuration occurs. As defined in Sect. 4, we can examine whether any of these abstract error paths corresponds to error paths in \(\mathsf {CS(A)}\). The detailed algorithm is given in Algorithm 1, which we explain in the following.

Algorithm 1
figure e

General Parameterized Coverability Checking

Procedure takes as argument an \(\mathsf {ACS(\overline{A})}\) and a basis for a set of error configurations ERR. After initializing local variables, the procedure enters a while loop that, given \(E_{i-1}\), invokes  (Line 5), a sub-procedure to compute \(E_i \supseteq minBasis(pred({\uparrow }E_{i-1}))\) such that \(\forall \overline{\sigma }\in {\uparrow }E_i \,\; \forall j <i \,\; \overline{\sigma }\not \in {\uparrow }E_j\). also updates visitedTrans which represents the set of transitions explored so far. The loop breaks once a fixed-point has been reached. Line 7 computes the visited set of initial configurations visitedInitialConfigs. If visitedInitialConfigs is not empty then the procedure extracts all computed error paths (Line 9). starts from the discovered initial configuration(s) and uses visitedTrans to construct the error graph errGraph which encodes all error paths. In a breadth-first fashion, the procedure then unfolds errGraph and evaluates the spuriousness of uncovered error paths that starts in \(\overline{\varSigma }_0\) and ends in ERR (Line 10). A path \(\mathsf {\bar{\pi }}=\overline{\sigma }_0,\overline{r}_0,\ldots ,\overline{r}_{j-1},\overline{\sigma }_j\) in errGraph is spurious if \(Concretize(\mathsf {\bar{\pi }}) \wedge Cover(\mathsf {\bar{\pi }})\) is unsatisfiable (see Sect. 4). In the presence of cyclesFootnote 3, unfolding may not terminate in the presence of shared-variable decrements and/or resets (see Theorem 1). If all error paths are spurious, we conclude that the system is safe. Otherwise a concrete path is returned as a witness of the buggy system (Line 12).

Procedure takes as argument the current set \(E_{i-1}\), and the set of visited transitions visitedTrans. We compute \(E_i\) as the predecessors of \(E_{i-1}\) as explained at the end of Sect. 5.2 (Line 16). At this point, \(E_i\) may contain configurations larger than those we have already explored. To ensure termination, they must be removed. However, before removing these configurations, we collect their visited transitions since these may introduce new, unexplored behaviors. The computation of a comprehensive set of visited transitions, denoted as visitedTrans in this context, is crucial for the correctness of our algorithm. The importance of this lies in the necessity to address spurious counterexamples, which mandates retrieving all paths from an initial configuration to an error configuration. This is in contrast to ordinary backward model checking, where a single counterexample suffices. After adding all computed transitions to visitedTrans (Line 17), we check for each configuration \(\overline{\sigma }\) in \(E_i\) the following:

  • For every configuration \(\overline{\sigma }_s\) with \(\overline{\sigma }_s \lesssim \overline{\sigma }\), we add from \(BasisTrans(E_{i-1})\) transitions that start in \(\overline{\sigma }\) to visitedTrans after replacing \(\overline{\sigma }\) with \(\overline{\sigma }_s\).

  • Otherwise, we add \(\overline{\sigma }\) to \(finalE_i\). Also, for every \(\overline{\sigma }_b \in finalE_i\) with \(\overline{\sigma }\lesssim \overline{\sigma }_b\), we add from \(BasisTrans(E_{i-1})\) transitions that start in \(\overline{\sigma }_b\) to visitedTrans after replacing \(\overline{\sigma }_b\) with \(\overline{\sigma }\), and we remove \(\overline{\sigma }_b\) from \(finalE_i\).

5.4 Correctness

Algorithm 1 is sound, complete, and terminates when the TA is restricted as in [8, 43, 44, 46, 47]. Soundness follows directly from encoding non-spuriousness into the constraint \(Concretize(\mathsf {\bar{\pi }}) \wedge Cover(\mathsf {\bar{\pi }})\), as defined in Sect. 4.

Corollary 2

(Soundness). Algorithm 1 is sound. That is, if the algorithm computes a non-spurious error path, then there is a configuration in \(\uparrow ERR\) that is reachable in \(\mathsf {CS(A)}\).

With Lemma 1, the following lemma proves the algorithm’s completeness.

Lemma 5

(Completeness). If \(\mathsf {ACS(\overline{A})}\) has a non-spurious error path \(\mathsf {\bar{\pi }}\), then Algorithm 1 will find it.

Termination. Since Algorithm 1 is sound and complete, it will terminate whenever \(\textsf{CS}\) has a path that covers \(\varSigma _{spec}\). However, Lines 21 and 25 of the algorithm could create cycles in errGraph. In the presence of cycles however, the algorithm could compute an infinite sequence of error paths. Therefore, we proved the following theorem.

Theorem 1

If \(\forall \overline{r}\in \overline{\mathcal {R}}: \overline{r}.\vec{uv}\in |\mathbb {N}_0|^{|\varGamma |} \wedge \overline{r}.\tau = \emptyset \), then Algorithm 1 terminates.

It is important to note that if errGraph is acyclic, the algorithm is also guaranteed to terminate. This includes the case when errGraph is empty, i.e., the corresponding \(\textsf{ACS}\) has neither spurious nor non-spurious error paths. The algorithm also terminates if no cycle in errGraph has decrements or resets.

6 Reachability via (0, 1)-Abstraction

Two configurations \(\sigma , \sigma '\) may be comparable with respect to order \(\lesssim \) even if some location l is occupied in \(\sigma \) while it is not occupied in \(\sigma '\). This implies that an algorithm based on upward-closed sets wrt. \(\lesssim \) cannot be used to decide reachability properties. Furthermore, note that coverability and reachability specifications (Definition 6 and 7) are agnostic to the precise number of processes in each location, i.e., they require only to distinguish between locations that are occupied by one or more processes and those that are not occupied.

To enable reachability checking and to enhance the performance of our algorithm for (non-general) coverability checking, we introduce a second, similar abstraction of our system model as in [39], where each counter can only assume one of two values: 1 to indicate that the location is currently occupied by at least one process; and 0 to indicate that it is not occupied.

(0, 1)-Configuration. Given an abstract threshold automaton \(\overline{A}=(L, \mathcal {I}, \overline{\varGamma }, \varPi ,\) \(\overline{\mathcal {R}})\), a (0, 1)-configuration is a tuple \({\sigma ^{z}}= (\vec{k}^{z}, \overline{\vec{g}})\), where \(\vec{k}^{z}\in \mathbb {B}^{\left| {L} \right| }\), and \(\overline{\vec{g}}\) is defined as before. That is \(\vec{k}^{z}[i]\) indicates the presence (1) or absence (0) of at least one process at location i.

Definition 8

A (0, 1)-counter system (or \(\textsf{ZCS}\)) of \(\overline{A}=(L, \mathcal {I}, \overline{\varGamma }, \varPi ,\overline{\mathcal {R}})\), is a transition system \(\mathsf {ZCS(\overline{A})}= (\varSigma ^z, \varSigma ^z_0, \mathcal {T}^{z})\), where:

  • \(\varSigma ^z= \mathbb {B}^{\left| {L} \right| } \times \mathcal {D}^{|\overline{\varGamma }|}\), is the set of (0, 1)-configurations

  • \(\varSigma ^z_0 \subseteq \varSigma ^z\) is the set of (0, 1)-configurations \({\sigma ^{z}}\) that satisfy the following:

    • \(\forall i \in \varGamma : \,\, {\sigma ^{z}}.\overline{\vec{g}}[i] = I_0\)

    • \(\forall i \in L: \,\, {\sigma ^{z}}.\vec{k}^{z}[i]=1 \implies i \in \mathcal {I}\)

  • the transition relation \(\mathcal {T}^{z}\) is the set of transitions \(({\sigma ^{z}}, \overline{r}, {\sigma ^{z}}')\) with:

    • \(\overline{r}=\{\textit{from},\rightarrow ,\overline{\varphi },\vec{uv}\} \in \overline{\mathcal {R}}\)

    • \({\sigma ^{z}}.\overline{\vec{g}}\models \overline{r}.\overline{\varphi }\)

    • \({\sigma ^{z}}.\vec{k}^{z}[\overline{r}.\textit{from}] = 1\) and \(({\sigma ^{z}}'.\vec{k}^{z}[\overline{r}.\textit{from}] = 0\) or \({\sigma ^{z}}'.\vec{k}^{z}[\overline{r}.\textit{from}] = 1)\)

    • \({\sigma ^{z}}'.\vec{k}^{z}[\overline{r}.\rightarrow ] = 1\)

    • \({\sigma ^{z}}'.\overline{\vec{g}}= {\sigma ^{z}}.\overline{\vec{g}}\dot{+} \vec{uv}\)

    • \(\forall x_i \in \overline{r}.\tau : \,\, {\sigma ^{z}}'.\overline{\vec{g}}[i] = I_0\)

Paths. A sequence \({\sigma ^{z}}_0, \overline{r}_0, {\sigma ^{z}}_1,\ldots ,{\sigma ^{z}}_{k-1},\overline{r}_{k-1}, {\sigma ^{z}}_k\) of alternating (0, 1)-configur-ations and abstract rules is a path of \(\mathsf {ZCS(\overline{A})}\) if \(\forall i < k\) we have \(({\sigma ^{z}}_{i},\overline{r}_i,{\sigma ^{z}}_{i+1}) \in \mathcal {T}^{z}\). We denote by \(Paths(\mathsf {ZCS(\overline{A})})\) the set of all paths of \(\mathsf {ZCS(\overline{A})}\).

We say that a 01-configuration \({\sigma ^{z}}\) satisfies a reachability specification \(L_{spec}\), denoted \({\sigma ^{z}}\models L_{spec}\), if for all \(i \in L_{=0}\), \({\sigma ^{z}}.\vec{k}^z[i]=0\), and for all \(i \in L_{>0}\), \({\sigma ^{z}}.\vec{k}^z[i]>0\). We say that \(\mathsf {ZCS(\overline{A})}\) satisfies \(L_{spec}\), denoted \(\mathsf {ZCS(\overline{A})}\models L_{spec}\), if there is a non-spurious path of \(\mathsf {ZCS(\overline{A})}\) that ends in \({\sigma ^{z}}\) with \({\sigma ^{z}}\models L_{spec}\).

Together with Lemma 2, the following lemma shows that our (0, 1)-abstraction is precise for reachability and coverability specifications. In other words, we have \(\mathsf {ZCS(\overline{A})}\models L_{spec}\) if and only if \(\mathsf {CS(A)}\models L_{spec}\), given that \(L_{spec}\) is a reachability or coverability specification.

Lemma 6

Let \(\overline{A}=(L, \mathcal {I}, \overline{\varGamma }, \varPi ,\overline{\mathcal {R}})\) be an abstract TA, and \(L_{spec}\) a reachability or coverability specification. We assume w.l.o.g. that \(\mathcal {I}= \{l_0\}\).

Then, there exists a path \({\sigma ^{z}}_0, \overline{r}_0, {\sigma ^{z}}_1, \ldots , \overline{r}_{n-1}, {\sigma ^{z}}_n \in Paths(\mathsf {ZCS(\overline{A})})\) such that \({\sigma ^{z}}_n \models L_{spec}\) if and only if there exists a path \(\overline{\sigma }_0, \overline{r}_0, \overline{\sigma }_1, \ldots , \overline{r}_{m-1}, \overline{\sigma }_m \in Paths(\mathsf {ACS(\overline{A})})\) such that \(\overline{\sigma }_m \models L_{spec}\).

6.1 Parameterized Reachability Algorithm (PRA)

Our algorithm PRA for solving the parameterized reachability problem accepts two input parameters: a 01-counter system \(\mathsf {ZCS(\overline{A})}\) and a finite set of error configurations \(ERR \subseteq \varSigma ^z\). It outputs either “The system is safe” to indicate that no (0, 1)-configuration in ERR is reachable, or an error path of \(\mathsf {ZCS(\overline{A})}\). PRA is very similar to Algorithm 1, differing primarily in the approach to computing the predecessor set. Instead of checking if two configurations are comparable, we look for equality. Moreover, in contrast to Algorithm 1, checking whether an error path \(\mathsf {\pi ^z}\) in \(\mathsf {ZCS(\overline{A})}\) is spurious is based on constraint \(Reach(\mathsf {\pi ^z})\) instead of \(Cover(\mathsf {\pi ^z})\), as described in Sect. 4. More details on PRA can be found in [10].

Just like Algorithm 1, we have demonstrated soundness and completeness of PRA in a general context, along with termination, subject to the restrictions outlined in Theorem 1. Moreover, with minimal adjustments to the reachability algorithm, we can also check coverability within a \(\textsf{ZCS}\) (see again [10]).

7 Implementation and Experimental Evaluation

We implemented Algorithm 1 with explicit, unbounded integer counters and PRA symbolically, leveraging CUDD Decision Diagrams [55] as BDD package. Both use Z3 [25] as SMT solver back-end. We evaluated our implementations on an AMD Ryzen 7 5800X CPU, running at 3.8 GHz with 32 GiB memory of system memory. For comparisons with ByMC [46], the state-of-the-art model checker for threshold automata, we executed the tool in the VM provided by the authors on the same machine, with 6 out of 8 cores and 25GiB of the total system memory. To ensure an equal environment, we also executed our tool in a virtual machine with the same restrictions.Footnote 4

Table 1. Comparison of execution time (in s) between ByMC [46] and our symbolic implementation of PRA. In the Prop. column, A denotes agreement, V denotes validity, and U denotes unforgeability. TO denotes a timeout after 1.5 h.

As benchmarks in our decidable fragment, we used the following threshold-based algorithms from the literature [42]: folklore reliable broadcast (frb) [20], one-step consensus with zero-degradation (cf1s) [18], consensus in one communication step (c1cs) [19], consistent broadcast (strb), asynchronous byzantine agreement (aba) [17], non-blocking atomic commit (nbacr [53] and nbacg [34]), condition-based consensus (cc) [50], and byzantine one step consensus (bosco) [56]. Moreover, we considered the following parts of the Red Belly blockchain [22], which have already been modeled as TA in [12]: the broadcast protocol (RB-bc), the one-round consensus protocol (RB), and a simplified one-round consensus protocol (RB-Simpl).

We verified interesting safety specifications: agreement (consistent decisions among correct processes), validity (the value that has been decided must have been proposed by some process), and unforgeability (if all correct processes have an initial value of 0, then no correct process ever accepts).

Table 1 compares execution times of ByMC and our implementation of PRA. It shows that PRA significantly outperforms ByMC on all benchmarks except two. Except for Red Belly protocols, our non-symbolic implementation can verify all the benchmarks above in less than 2 min.

For the undecidable fragment, we used our extended model of TA to model the following FTDAs from the literature: reliable broadcast [57], k-set agreement [21], multi-round simplified Red Belly blockchain consensus [22], multi-round full Red Belly blockchain consensus [22], and phase king consensus [11]. Note that, in [12], only the one-round protocols were modeled, and only the simple version was verified. In contrast, we can model the multi-round versions, and were able to verify all (single- and multi-round) versions except one.

Table 2. Execution time (in s) for PRA on multi-round protocols. TO denotes a timeout after 2 h.

For the aforementioned TAs, we conducted our experiments on a machine with 2x AMD EPYC 7773x - 128 Cores, 256 Threads and 2TB RAM. Table 2 presents execution for running our symbolic implementation on all the aforementioned multi-round protocols. For the phase-king protocol, PRA was able to locate a bug in an incorrect model of the algorithm within 2 minutes, and was also able to prove partial correctness properties, for example that consensus is actually reachable, in around 5 min. We verified validity for all the remaining benchmarks, and for floodmin we verified additionally agreement. During benchmarking, memory usage peaked at 73 GB.

8 Related Work

After presenting an approach for verifying FTDAs based on abstraction [40], Konnov et al. [43,44,45,46] developed several approaches and algorithms specifically tailored for verifying the safety and liveness properties of threshold automata. Starting from an acyclic CFA (control flow automaton), they construct in [40] a finite counter system using parametric interval abstraction for variables and counters. To refine the abstraction, they detect spurious transitions using ordinary model checking and user-defined invariants. Subsequent works [43, 45] improve on the efficiency when verifying reachability properties in TAs. In [45], they showed the existence of an upper bound on the distance between states within counter systems of TAs, hence, demonstrating the completeness of bounded model checking. In [43], the authors use partial order reduction to generate a finite set of sequences comprising sets of guards and rules. Each of these sequences represents a possibly infinite set of error traces. An SMT solver is employed to validate the existence of concrete error traces. Extending  [43], Konnov et al. presented an approach capable of detecting lasso-shaped traces that violate a given liveness property [44]. The latter two approaches have been implemented in the tool ByMC [46]. To enable the functionality of all aforementioned approaches, the authors found it necessary to impose constraints on threshold automata. This involved explicitly prohibiting cycles and variable decrements.

A verification tool for parameterized distributed algorithms has been introduced in [59]. The tool relies on layered threshold automata [14] as a system model, which can be seen as an infinitely repeating threshold automata. However, the tool requires users to identify layers (rounds) in the model, their sequence (infinite interleaving or lasso-shaped sequences), and to provide predicates.

Decidability and the complexity of verification and synthesis of threshold automata have also been studied in [8]. Their decision procedure is based on an SMT encoding of potential error paths, where in general the size of the SMT formula grows exponentially with the length of the paths. While having achieved good results with some heuristics that avoid this exponential blow-up in practice, we believe that these heuristics would not work for threshold automata with decrements and/or resetsFootnote 5. Moreover, their method requires a bound on the number of changes in the valuation of thresholds. However, such a constraint does not apply in the presence of decrements and/or resets.

9 Conclusion

In this paper, we have introduced an extension of the computational model known as threshold automata, to support decrements and resets of shared variables. This extension in general comes at the cost of decidability, even for simple state-reachability properties. We developed a semi-decision procedure for this extended notion of TA, supporting not only the simple coverability properties from the TA literature, but also general coverability properties as known from Petri nets or well-structured systems. To support also reachability properties, we presented an additional abstraction, called (0, 1)-abstraction, which is the basis for a semi-decision procedure for reachability properties of extended TAs.

We have implemented our techniques and evaluated them on examples from the literature, and on several round-based algorithms that cannot be modeled with canonical TAs. We show that our semi-decision procedure can find bugs in a faulty protocol and prove correctness of protocols, even outside the known decidable fragment. Moreover, on a set of benchmarks in the decidable fragment, it matches or outperforms the TA model checker ByMC [46].