Parameterized Verification of Round-Based Distributed Algorithms via Extended Threshold Automata

Baumeister, Tom; Eichler, Paul; Jacobs, Swen; Sakr, Mouhammad; Völp, Marcus

doi:10.1007/978-3-031-71162-6_33

Part of the book series: Lecture Notes in Computer Science ((LNCS,volume 14933))

Included in the following conference series:

International Symposium on Formal Methods

23 Accesses

Abstract

Threshold automata are a computational model that has proven to be versatile in modeling threshold-based distributed algorithms and enabling their completely automatic parameterized verification. We present novel techniques for the verification of threshold automata, based on well-structured transition systems, that allow us to extend the expressiveness of both the computational model and the specifications that can be verified. In particular, we extend the model to allow decrements and resets of shared variables, possibly on cycles, and the specifications to general coverability. While these extensions of the model in general lead to undecidability, our algorithms provide a semi-decision procedure. We demonstrate the benefit of our extensions by showing that we can model complex round-based algorithms such as the phase king consensus algorithm and the Red Belly Blockchain protocol (published in 2019), and verify them fully automatically for the first time.

You have full access to this open access chapter, Download conference paper PDF

1 Introduction

Due to the increasing prevalence and importance of distributed systems in our society, ensuring reliability and correctness of these systems has become paramount. Computer-aided verification of distributed protocols and algorithms has been a very active research area in recent years [35, 37, 49, 52, 60]. To be practical, models of distributed algorithms need to take into account that communication or processes may be faulty, and correctness guarantees should be given based on a resilience condition that defines the quantity and quality of faults (e.g., how many processes may crash or even behave arbitrarily [13, 45, 48]).

Moreover, many distributed systems consist of an arbitrary number of communicating processes, thus requiring parameterized verification techniques that consider the number of processes as a parameter of the system, and that can provide correctness guarantees regardless of this parameter. However, the parameterized verification problem is in general undecidable, even in restricted settings such as identical and anonymous finite-state processes that communicate by passing a binary-valued token in a ring [58].

Since undecidability arises so easily in this setting, the research into automatic parameterized verification can in principle be divided into two directions: (i) identifying decidable classes of systems and properties [1, 3,4,5,6,7, 15, 24, 26,27,28,29,30, 32, 33, 36, 38, 41, 54], which are often restricted to rather specialized use cases, and (ii) developing semi-decision procedures [2, 16], which are usually much more versatile but come without a termination guarantee. In practice however, the line between these two approaches is not so clear. The line blurs when semi-decision procedures serve as decision procedures for a certain fragment of their possible inputs, and on the other hand, many decidable problems in parameterized verification have a huge complexity [24, 54], and to an engineer it does not make a difference if the verification algorithm is guaranteed to terminate within a thousand years, or whether it comes without a termination guarantee.

In this paper, we aim for practical verification techniques that cover a large class of fault-tolerant distributed algorithms. Our focus is not decidability, but the development of techniques that can in practice verify a wide range of distributed algorithms, with full automation. We build on the formal model of threshold automata (TA) and existing techniques to verify them, which we combine with insights from the theory of well-structured transition systems (WSTS) [32] and with abstraction techniques. This combination allows us to lift some of the restrictions of the existing decidable fragments for TAs, and to verify distributed algorithms that are not supported by the existing techniques.

In particular, existing verification techniques for TAs are usually restricted to reason about shared variables that are monotonically increasing or decreasing over any run of the automaton. An extension of the model that allows both increasing and decreasing variables has been considered before [47], but only to show that the problem becomes undecidable, not to develop a verification technique for this case. In our setting, we can allow increments, decrements and resets of shared variables, while still obtaining a semi-decision procedure, and even a decision procedure in certain cases. In this paper, we demonstrate how this allows us to reason about round-based algorithms such as the phase king consensus algorithm [11] and the Red Belly Blockchain algorithm [23], which both use resets of shared variables at the beginning of a round, without a fixed bound on the number of rounds to be executed.

Contributions. In this paper,

1.
we present an extension of threshold automata that allows increments, decrements and resets of shared variables (Sect. 2),
2.
we develop a technique (based on well-structured transition systems) that is a decision procedure for general coverability properties of canonical threshold automata [47], and is a semi-decision procedure for our extension (Sect. 3–5),
3.
we develop an additional abstraction that reduces the search space of this technique and additionally allows us to check another type of specification, called reachability properties, on extended threshold automata (Sect. 6),
4.
we implement our techniques and demonstrate their performance on a number of examples from the literature, several of which—including the phase king consensus algorithm and the state-of-the-art Red Belly Blockchain algorithm—cannot be modeled in canonical threshold automata (Sect. 7).

2 System Model

In this section, we build on the existing notion of threshold automata (TAs) [45] and generalize their notion of shared variables. TAs are a model of distributed computation that encodes information exchange between processes into a fixed set of shared variables. Shared variables in TAs are usually required to be monotonically increasing along an execution to ensure decidability. We extend the definition such that it permits shared variables to be decreased or reset, which in general introduces undecidability [47]. We then define the semantics of an unbounded number of such TAs running in parallel. In Sect. 2.1, we introduce the notion of an abstract TA and provide the semantics for this abstraction.

Definition 1

A threshold automaton is a tuple $A=(L, \mathcal {I}, \varGamma , \varPi ,\mathcal {R}, RC)$ where:

L is a finite set of locations.
$ \mathcal {I}\subseteq L$ is the set of initial locations.
$\varGamma = \{x_0,\ldots ,x_m\}$ is a finite set of shared variables over $\mathbb {N}_0$.
$\varPi $ is a finite set of parameter variables over $\mathbb {N}_0$. Usually, $\varPi = \{n,t,f\}$, where n is the total number of processes, t is a bound on the number of tolerated faulty processes , and f is the actual number of faulty processes.
RC, the resilience condition, is a linear integer arithmetic formula over parameter variables. E.g.: $RC = n > 3t\, \wedge \, t \ge f$. For a vector $\vec{p}\in \mathbb {N}_0^{|\varPi |}$, we write $\vec{p}\models RC$ if RC holds after substituting parameter variables with values according to $\vec{p}$. Then the set of admissible parameters is $\vec{P}_{RC} = \{\vec{p}\in \mathbb {N}_0^{|\varPi |} : \vec{p}\models RC\}$.
$\mathcal {R}$ is a set of rules where a rule is a tuple $r=(\textit{from}, \rightarrow , \varphi ,\vec{uv},\tau )$ such that:
- $\textit{from},\rightarrow \in L$.
- $\vec{uv}\in |\mathbb {Z}|^{|\varGamma |}$ is an update vector for shared variables.
- $\varphi $ is a conjunction of lower guards and upper guards. A lower guard has the form: $a_0 + \sum _{i=1}^{|\varPi |} a_i \cdot p_i \le x$; An upper guard has the form: $ a_0 + \sum _{i=1}^{|\varPi |} a_i \cdot p_i > x$, with $x \in \varGamma $, $a_0,\ldots ,a_{|\varPi |} \in \mathbb {Q}$, $p_1,\ldots ,p_{|\varPi |} \in \varPi $. The left-hand side of a lower or upper guard is called a threshold.
- $\tau \subseteq \varGamma $ is the set of shared variables to be reset to 0.

Example 1

Figure 1 sketches a threshold automaton with $\mathcal {I}=\{v_0,v_1\}$, $L = \{v_0,v_1, Wait, d_0,d_1\}$, $\varGamma =\{x_0,x_1\}$, $\varPi =\{n,t,f\}$. A process in $v_0$ has a vote of 0 and a process in $v_1$ has a vote of 1. If at least $n-t$ processes vote with 0 (respectively, 1), the decision will be 0 (1), modeled by all processes moving to $d_0$ ($d_1$).

Semantics of TA. Given a TA $A = (L, \mathcal {I}, \varGamma , \varPi ,\mathcal {R}, RC)$, let the function $N: \, \vec{P}_{RC} \rightarrow \mathbb {N}_0$ determine the number of processes to be modelled (usually, $N(n,t,f) = n-f$). Then, the concrete semantics of a system composed of $N(\vec{p})$ threshold automata running in parallel are defined via a counter system.

Definition 2

A counter system ($\textsf{CS}$) of a TA $A = (L, \mathcal {I}, \varGamma , \varPi ,\mathcal {R}, RC)$ is a transition system $\mathsf {CS(A)}= (\varSigma , \varSigma _0, \mathcal {T})$ where

$\varSigma $ is the set of configurations. A configuration is a tuple $\sigma =(\vec{k},\vec{g},\vec{p})$ where:
- $\vec{k}\in \mathbb {N}_0^{|L|}$ is a vector of counter values, where $\vec{k}[i]$ represents the number of processes in location i. We refer to locations by their indices in L.
- $\vec{g}\in \mathbb {N}_0^{|\varGamma |}$ is a vector of shared variables’ values, where $\vec{g}[i]$ is the value of variable $x_i \in \varGamma $.
- $\vec{p}\in \vec{P}_{RC}$ is an admissible vector of parameter values.
The set $\varSigma _0$ contains all initial configurations, i.e., configurations that satisfy
$$\begin{aligned} \forall x_i \in \varGamma : \,\, \sigma .\vec{g}[i] = 0 \text { and } \sum _{i \in \mathcal {I}} \sigma .\vec{k}[i] = N(\vec{p}) \text { and } \sum _{i \not \in \mathcal {I}} \sigma .\vec{k}[i] = 0 \end{aligned}$$
$\mathcal {T}\subseteq \varSigma \times \mathcal {R}\times \varSigma $ is the set of transitions, where $(\sigma , r, \sigma ') \in \mathcal {T}$ if and only if all of the following conditions hold:
- $\sigma '.\vec{p}= \sigma .\vec{p}$ (parameter values never change).
- $\sigma '.\vec{k}[r.\rightarrow ] = \sigma .\vec{k}[r.\rightarrow ] + 1$ (one process moves to $r.\rightarrow $).
- $\sigma '.\vec{k}[r.\textit{from}] = \sigma .\vec{k}[r.\textit{from}] - 1$ (one process moves out of $r.\textit{from}$)
- $\sigma .\vec{g}\models r.\varphi $ (i.e., $\varphi $ holds after replacing shared variables with values $\sigma .\vec{g}$)
- $\sigma '.\vec{g}= \sigma .\vec{g}+ r.\vec{uv}$
- $\forall x_i \in \tau $ $\sigma '.\vec{g}[i] = 0$
Instead of $(\sigma , r, \sigma ') \in \mathcal {T}$ we also write $\sigma \xrightarrow {r} \sigma '$. If $(\sigma , r, \sigma ') \in \mathcal {T}$, we say r is enabled in $\sigma $; otherwise it is disabled.

Paths of $\textsf{CS}$. A sequence $\sigma _0, r_0, \sigma _1,\ldots ,\sigma _{k-1},r_{k-1}, \sigma _k$ of alternating configurations and rules is a path of a counter system $\mathsf {CS(A)}= (\varSigma , \varSigma _0, \mathcal {T})$ if and only if $\sigma _0 \in \varSigma _0$ and $(\sigma _{i},r_i,\sigma _{i+1}) \in \mathcal {T}$ for $0 \le i < k$. In this case we also write $\sigma _0 \rightarrow ^* \sigma _k$. We denote by $Paths(\mathsf {CS(A)})$ the set of all paths of $\mathsf {CS(A)}$.

Example 2

Let $N(n,t,f) = n-f$, $RC = n > 3t \wedge t \ge f$, then the following is a valid path of the counter system of the TA in Fig. 1:

$[(4,0,0,0,0)(0,0)],r_0,[(3,0,1,0,0)(1,0)],r_0,[(2,0,2,0,0)(2,0)],r_0$,

$[(1,0,3,0,0)(3,0)],r_0,[(0,0,4,0,0)(4,0)],r_2,[(0,0,3,1,0)(4,0)]$.

2.1 Abstract Threshold Automata

The shared variables and parameters of a TA have infinite domains. To facilitate the application of parameterized verification techniques for finite-state processes, we introduce an abstraction of TAs based on parametric interval abstraction [40].

Abstract Domain. Given a TA A, define as $\mathcal{T}\mathcal{H}=\{d_0,d_1,\ldots ,d_k\}$ the set of thresholds where $d_0=0,d_1=1$ and $\forall i >1$ $d_i$ is a threshold in A. We assume that $\forall i,j$ $d_i < d_j$ if $i < j$. Note that this is always possible for a fixed $\vec{p}\in \vec{P}_{RC}$. If different $\vec{p}\in \vec{P}_{RC}$ result in different orders of the $d_i$, then we consider each of the finitely many such orders separately. Based on this, define the finite set of intervals $\mathcal {D}=\{I_0,I_1,\ldots ,I_k\}$ where $I_i = [d_i,d_{i+1}[$ if $i<k$, and $I_k = [d_k,\infty [$.

Definition 3

Abstract Threshold Automata. Given a threshold automaton $A=(L, \mathcal {I}, \varGamma , \varPi ,\mathcal {R}, RC)$, we define the abstract threshold automaton (or $\overline{\text {TA}}$) $\overline{A}=(L, \mathcal {I}, \overline{\varGamma }, \varPi ,\overline{\mathcal {R}})$ where:

A and $\overline{A}$ share the components $L,\mathcal {I},\varPi $.
Let $\varGamma = \{x_0,\ldots ,x_m\}$, then $\overline{\varGamma }= \{\overline{x}_0,\ldots ,\overline{x}_m\}$, where each $\overline{x}_i$ is over the domain $\mathcal {D}=\{I_0,I_1,\ldots ,I_k\}$.
$\overline{\mathcal {R}}$ is the set of abstract rules. An abstract rule is a tuple $\overline{r}=(\textit{from}, \rightarrow , \overline{\varphi }, \vec{uv},$ $ \tau )$ where $\textit{from}, \rightarrow , \vec{uv}, \tau $ are as before, and the abstract guard $\overline{\varphi }$ is a Boolean expression over equalities between shared variables and abstract values. Formally, let $\varphi = \varphi _0 \wedge \ldots \wedge \varphi _k$, then $\overline{\varphi }= \overline{\varphi }_0 \wedge \ldots \wedge \overline{\varphi }_k$ where for $\varphi _i = (d_j \le x)$, we have $\overline{\varphi }_i = \bigvee _{c=j}^{k-1} (\overline{x}= [d_c,d_{c+1}[) \vee \overline{x}= [d_k,\infty [$, and for $\varphi _i = (d_j > x)$, we have $\overline{\varphi }_i = \bigvee _{c=0}^{j-1} (\overline{x}= [d_c,d_{c+1}[)$.

Example 3

Consider again the TA in Fig. 1 with $N(n,t,f) = n-f$, $RC = n > 3t \wedge t \ge f > 1$. We have $\mathcal{T}\mathcal{H}=\{0,1,t,n-t\}$ and $\mathcal {D}=\{[0,1[,[1,t[,[t,n-t[, [n-t,\infty [\}$ whose order is induced by RC. Moreover, we have $\overline{r}_0 = r_0,\overline{r}_1 = r_1$ (due to the absence of a guard), $\overline{r}_2.\overline{\varphi }= (\overline{x}_0 = [n-t,\infty [)$, $\overline{r}_3.\overline{\varphi }= (\overline{x}_1 = [n-t,\infty [)$.

To keep the presentation simple, in our definition all shared variables have the same abstract domain. The abstraction can be improved by considering different abstract domains for different variables: for a given shared variable x we can let $\mathcal{T}\mathcal{H}_x=\{d_0,d_1,\ldots ,d_l\}$ where $d_0=0,d_1=1$ and $\forall i >1$ there is a guard $d_i * x$ with $* \in \{\ge , <\}$, to obtain a corresponding abstract domain $\mathcal {D}_x$ for x.

2.1.1 Semantics of $\overline{\text {TA}}$.

We first over-approximate the semantics of a system composed of an arbitrary number of $\overline{\text {TA}}$s by an abstract counter system. We later show how to detect whether a behavior of the abstract counter system corresponds to a concrete behavior of a counter system.

Definition 4

An abstract counter system ($\textsf{ACS}$) of $\overline{A}=(L, \mathcal {I}, \overline{\varGamma }, \varPi ,\overline{\mathcal {R}})$ is a transition system $\mathsf {ACS(\overline{A})}= (\overline{\varSigma }, \overline{\varSigma }_0, \overline{\mathcal {T}})$ where:

$\overline{\varSigma }$ is the set of abstract configurations. A configuration of $\mathsf {ACS(\overline{A})}$ is a tuple $\overline{\sigma }=(\vec{k},\overline{\vec{g}})$ where:
- $\vec{k}\in \mathbb {N}_0^{|L|}$ is a vector of counter values where $\vec{k}[i]$ represents the number of processes in location i.
- $\overline{\vec{g}}\in \mathcal {D}^{|\overline{\varGamma }|}$ is a vector of shared variables values, where $\overline{\vec{g}}[i]$ is the parametric interval currently assigned to $\overline{x}_i$.
The set $\overline{\varSigma }_0$ contains all initial abstract configurations, i.e., those that satisfy
$$\begin{aligned} \forall \overline{x}_i \in \overline{\varGamma }: \,\, \overline{\sigma }.\overline{\vec{g}}[i] = I_0 \text { and } \sum _{i \in \mathcal {I}} \overline{\sigma }.\vec{k}[i] \ge 0 \text { and } \sum _{i \not \in \mathcal {I}} \overline{\sigma }.\vec{k}[i] = 0 \end{aligned}$$
$\overline{\mathcal {T}}\subseteq \overline{\varSigma }\times \overline{\mathcal {R}}\times \overline{\varSigma }$ is the set of transitions. A transition is a tuple $t=(\overline{\sigma }, \overline{r}, \overline{\sigma }')$ where:
- $\overline{\sigma }'.\vec{k}[\overline{r}.\rightarrow ] = \overline{\sigma }.\vec{k}[\overline{r}.\rightarrow ] + 1$
- $\overline{\sigma }'.\vec{k}[\overline{r}.\textit{from}] = \overline{\sigma }.\vec{k}[\overline{r}.\textit{from}] - 1$
- $\overline{\sigma }.\overline{\vec{g}}\models \overline{r}.\overline{\varphi }$
- $\overline{\sigma }'.\overline{\vec{g}}= \overline{\sigma }.\overline{\vec{g}}\dot{+} \overline{r}.\vec{uv}$, defined as follows: $\forall i < |\varGamma | $:
  1. 1.
    $\overline{\sigma }'.\overline{\vec{g}}[i] = \overline{\sigma }.\overline{\vec{g}}[i]$, if $\overline{r}.\vec{uv}[i] = 0$
  2. 2.
    $(\overline{\sigma }'.\overline{\vec{g}}[i] = \overline{\sigma }.\overline{\vec{g}}[i]) \vee (\overline{\sigma }'.\overline{\vec{g}}[i] = \overline{\sigma }.\overline{\vec{g}}[i].next)$, if $\overline{r}.\vec{uv}[i] = 1$
  3. 3.
    $(\overline{\sigma }'.\overline{\vec{g}}[i] = \overline{\sigma }.\overline{\vec{g}}[i]) \vee (\overline{\sigma }'.\overline{\vec{g}}[i] = \overline{\sigma }.\overline{\vec{g}}[i].previous)$, if $\overline{r}.\vec{uv}[i] = -1$
    
    the first disjunct in 2 and the second in 3 are omitted if $\overline{\sigma }.\overline{\vec{g}}[i] = I_0$.
  4. 4.
    $\forall x_i \in \overline{r}.\tau : \,\, \overline{\sigma }'.\overline{\vec{g}}[i] = I_0$
We also write $\overline{\sigma } \xrightarrow {\overline{r}} \overline{\sigma }'$ instead of $(\overline{\sigma }, \overline{r}, \overline{\sigma }') \in \overline{\mathcal {T}}$.

In contrast to prior work [40], the domain of counters is not abstracted in $\textsf{ACS}$.

Paths of $\textsf{ACS}$. A sequence $\overline{\sigma }_0, \overline{r}_0, \overline{\sigma }_1,\ldots ,\overline{\sigma }_{k-1},\overline{r}_{k-1}, \overline{\sigma }_k$ of alternating abstract configurations and rules is called a path of $\mathsf {ACS(\overline{A})}= (\overline{\varSigma }, \overline{\varSigma }_0, \overline{\mathcal {T}})$, if $\overline{\sigma }_0 \in \overline{\varSigma }_0$ and $(\overline{\sigma }_{i},\overline{r}_i,\overline{\sigma }_{i+1}) \in \overline{\mathcal {T}}$ for $0 \le i \le k$. In this case we also write $\overline{\sigma }_0 \rightarrow ^* \overline{\sigma }_k$. We denote by $Paths(\mathsf {ACS(\overline{A})})$ the set of all paths of $\mathsf {ACS(\overline{A})}$.

Example 4

Let $I_0=[0,1[,I_1=[1,t[,I_2=[t,n-t[,I_3=[n-t,\infty [ $. The following is a valid path of the abstract counter system of the TA in Fig. 1: $[(4,0,0,0,0)(I_0,I_0)],\overline{r}_0,[(3,0,1,0,0)(I_1,I_0)],\overline{r}_0,[(2,0,2,0,0)(I_2,I_0)],\overline{r}_0$,

$[(1,0,3,0,0)(I_3,I_0)],\overline{r}_2,[(1,0,2,1,0)(I_3,I_0)]$.

Relation Between $\mathsf {ACS(\overline{A})}$ and $\mathsf {CS(A)}$. In comparison to $\textsf{CS}$, in $\textsf{ACS}$ we drop the resilience condition, as well as the function N that determines the number of processes to be modeled. Moreover, a transition in $\textsf{ACS}$ may jump from one interval to the next too early and may stay in the same interval although it had to move. We will formalize the relation between the two models in Sect. 4.

3 Specifications

We consider three kinds of specifications: general coverability, which refers to the notion of coverability that is widely used in parameterized verification, e.g., Petri nets [31] or VASS [51], (non-general) coverability and reachability. The latter two are specifications used in prior work on threshold automata [8, 43, 47]. Note that general coverability can express mutual exclusion, e.g., that there cannot be two leaders at the same time, while coverability cannot.

Definition 5

The general parameterized coverability problem is: Given $\mathsf {CS(A)}$ and a general coverability specification $\varSigma _{spec} \subseteq \varSigma $, decide if there is a path in $\mathsf {CS(A)}$ that covers $\varSigma _{spec}$ (i.e., decide if there is some configuration $\sigma _r \in \varSigma $ reachable from $\sigma _0 \in \varSigma _0$ and $\exists \sigma _{spec} \in \varSigma _{spec}$ where $\forall i \, \sigma _{spec}.\vec{k}[i] \le \sigma _r.\vec{k}[i]$).

Definition 6

The parameterized coverability problem is: Given a TA and coverability specification $L_{spec} = L_{>0}$, decide if there is a path of its $\mathsf {CS(A)}$ that satisfies $L_{spec}$ (i.e., decide if there is some configuration $\sigma _r \in \varSigma $ reachable from $\sigma _0 \in \varSigma _0$ and $\sigma _r$ satisfies $\forall i \in L_{>0} \, \sigma _r.\vec{k}[i] > 0$).

Definition 7

The parameterized reachability problem is: Given a TA and a reachability specification $L_{spec} = (L_{=0}, L_{>0})$, decide if there is a path of $\mathsf {CS(A)}$ that satisfies $L_{spec}$ (i.e., decide if there is some configuration $\sigma _r \in \varSigma $ reachable from $\sigma _0 \in \varSigma _0$ and $\sigma _r$ satisfies $\forall i \in L_{=0} \, \sigma _r.\vec{k}[i] = 0$ and $\forall i \in L_{>0} \, \sigma _r.\vec{k}[i] > 0$).

We define similarly all three types of problems for an abstract TA and its abstract counter system. Usually, our specifications are definitions of error configurations, and therefore paths that satisfy them, are called error paths.

4 $\textsf{CS}$ vs $\textsf{ACS}$

We now show that the abstraction from $\textsf{CS}$ to $\textsf{ACS}$ is complete with respect to the specifications introduced in the previous section.

A path $\mathsf {\bar{\pi }}= \overline{\sigma }_0, \overline{r}_0,\ldots ,\overline{r}_{m-1}, \overline{\sigma }_m$ in $\mathsf {ACS(\overline{A})}= (\overline{\varSigma }, \overline{\varSigma }_0, \overline{\mathcal {T}})$ corresponds to the paths $\pi = \sigma _0, r_0^{c_0},\ldots ,r_{m-1}^{c_{m-1}}, \sigma _m$ (where $r_i^{c_i}$ simulates applying $r_i$ $c_i$ times) of $\mathsf {CS(A)}=(\varSigma , \varSigma _0, \mathcal {T})$ that satisfy the following conditions:

$ RC \wedge (\textstyle \sum _{j \in \mathcal {I}} \sigma _0.\vec{k}[j] = N(n,t,f)) $
$\forall i < m \; \sigma _{i}.\vec{k}[r_i.\textit{from}] = c_i + \sigma _{i+1}.\vec{k}[r_i.\textit{from}] \wedge \sigma _{i+1}.\vec{k}[r_i.\rightarrow ] = c_i + \sigma _{i}.\vec{k}[r_i.\rightarrow ] $
$ \forall i < m \; \forall x_j \in \varGamma \; x_j \notin r_i.\tau \implies \sigma _{i+1}.\vec{g}[j] = \sigma _{i}.\vec{g}[j] + c_i \cdot r_i.\vec{uv}[j]$
$\forall i < m \; \forall x_j \in r_i.\tau \; \sigma _{i+1}.\vec{g}[j] = 0$
$\forall i < m \; \forall x_j \in \varGamma \; \sigma _{i}.\vec{g}[j] \in \overline{\sigma }_{i}.\overline{\vec{g}}[j] \wedge \sigma _{i+1}.\vec{g}[j] \in \overline{\sigma }_{i+1}.\overline{\vec{g}}[j] $
$\forall i < m \; c_i > 1 \implies ((\sigma _{i+1}.\vec{g}- r_i.\vec{uv}) \models r_i.\varphi )$^{Footnote 1}

Let $Concretize(\mathsf {\bar{\pi }})$ be the conjunction of the constraints above, where quantified formulas are instantiated to a finite conjunction of quantifier-free formulas. Note that $Concretize(\mathsf {\bar{\pi }})$ is a quantifier-free formula in linear integer arithmetic, and a satisfying assignment for $Concretize(\mathsf {\bar{\pi }})$ (that can be computed by an SMT solver) represents a path of $\mathsf {CS(A)}$ that corresponds to $\mathsf {\bar{\pi }}$. We say that a path $\mathsf {\bar{\pi }}\in Paths(\mathsf {ACS(\overline{A})})$ is spurious if $Concretize(\mathsf {\bar{\pi }})$ is unsatisfiable.

For a given $\mathsf {\bar{\pi }}\in Paths(\mathsf {ACS(\overline{A})})$, let $Cover(\mathsf {\bar{\pi }})= \forall l \in L \; \sigma _m.\vec{k}[l] \ge \overline{\sigma }_m.\vec{k}[l]$. We can show^{Footnote 2} the following connection between error paths in $\mathsf {CS(A)}$ and $\mathsf {ACS(\overline{A})}$:

Lemma 1

$\mathsf {ACS(\overline{A})}$ has a non-spurious path that covers a set of configurations $\varSigma _{spec} \subseteq \varSigma $ iff $\mathsf {CS(A)}$ has a path that covers $\varSigma _{spec}$.

Note that Lemma 1 subsumes the case of non-general coverability.

Similarly, if $\mathsf {\bar{\pi }}$ is a non-spurious path of $\mathsf {ACS(\overline{A})}$ that satisfies a reachability specification $L_{spec}$, let $Reach(\mathsf {\bar{\pi }})= \forall l \in L \; \sigma _m.\vec{k}[l] > 0 \iff \overline{\sigma }_m.\vec{k}[l] > 0$. Then we can show the following with a similar proof as above, where we replace $Cover(\mathsf {\bar{\pi }})$ with $Reach(\mathsf {\bar{\pi }})$ and reason accordingly.

Lemma 2

$\mathsf {ACS(\overline{A})}$ has a non-spurious path that satisfies a reachability specification $L_{spec}$ iff $\mathsf {CS(A)}$ has a path that satisfies $L_{spec}$.

5 Checking General Parameterized Coverability

In this section, we show how abstract counter systems (including ours) can be framed as well-structured transition systems. We also introduce a parameterized model checking algorithm for checking general parameterized coverability in $\textsf{ACS}$.

5.1 Well-Structured Transition Systems

Well-structured transition systems [32] (WSTS) are a class of infinite-state systems for which the general parameterized coverability problem is decidable [1, 39]. In the following, we recap the standard definitions of WSTS.

Well-Quasi-Order. Given a set $S$, a binary relation $\preceq \, \subseteq S\times S$ is a well-quasi-order (wqo) if $\preceq $ is reflexive, transitive, and if any infinite sequence $s_0,s_1,\ldots \in S^{\omega }$ contains a pair $s_i \preceq s_j$ with $i < j$. A subset $R \subseteq S$ is an anti-chain if any two distinct elements of R are incomparable wrt. $\preceq $. Therefore, $\preceq $ is a wqo on $S$ if and only if it is reflexive, transitive, and has no infinite anti-chains. The upward closure of a set $R \subseteq S$, denoted ${\uparrow }R$, is the set $\{s\in S\mid \exists s' \in R: s' \preceq s\}$. We say that R is upward-closed if ${\uparrow }R = R$, and we call $B \subseteq S$ a basis of R if ${\uparrow }B = R$. If $\preceq $ is also anti-symmetric, then any basis of R has a unique subset of minimal elements. We call this set the minimal basis of R, denoted minBasis(R).

Compatibility. Given a transition system $M=(S,S_0,\mathcal {T})$, we say that a wqo $\preceq \; \subseteq S\times S$ is compatible with the transition relation $\mathcal {T}$ if the following holds:

where $s \xrightarrow s'$ is a transition in $\mathcal {T}$, and is a path in M.

WSTS. We say that $(M, \preceq )$ with $M=(S,S_0,\mathcal {T})$ is a well-structured transition system if $\preceq $ is a wqo on $S$ that is compatible with $\mathcal {T}$. The set of immediate predecessors of a set $R\subseteq S$ is $pred(R) = \{s\in S\mid \exists s' \in R: s \xrightarrow s' \}.$ We say that a WSTS $(M,\preceq )$ has effective pred-basis if there exists an algorithm that takes as input any finite set $R \subseteq S$ and returns a finite basis of $pred({\uparrow }R)$.

5.2 Abstract Counter Systems as WSTS

To prove that the general parameterized coverability problem is decidable for abstract TAs, we show that an $\textsf{ACS}$ can be framed as WSTS. Here, for a given set $E' \subseteq \overline{\varSigma }$ we define $pred(E') = \{ \overline{\sigma }\in \overline{\varSigma }\mid \overline{\sigma }' \in E' \wedge \exists \overline{r}\in \overline{\mathcal {R}}\text { s.t. } (\overline{\sigma }, \overline{r}, \overline{\sigma }') \in \overline{\mathcal {T}}\}$.

Lemma 3

Given an abstract counter system $\mathsf {ACS(\overline{A})}= (\overline{\varSigma }, \overline{\varSigma }_0, \overline{\mathcal {T}})$ let $\lesssim \; \subseteq \overline{\varSigma }\times \overline{\varSigma }$ be the binary relation defined by:

$$(\vec{k},\overline{\vec{g}}) \lesssim (\vec{k}',\overline{\vec{g}}') \iff \vec{k}\le \vec{k}' \wedge \overline{\vec{g}}= \overline{\vec{g}}'$$

where $\le $ is the component-wise ordering of vectors. Then $(\mathsf {ACS(\overline{A})},\lessapprox )$ is a WSTS.

Note also that the order $\lesssim $ is anti-symmetric, and therefore every upward-closed set of configurations has a unique minimal basis.

Lemma 4

Given an abstract counter system $\mathsf {ACS(\overline{A})}= (\overline{\varSigma }, \overline{\varSigma }_0, \overline{\mathcal {T}})$, the WSTS $((\overline{\varSigma }, \overline{\varSigma }_0, \overline{\mathcal {T}}), \lesssim )$ has effective pred-basis.

Let BasisTrans be the transitions from which we computed CPredBasis in the proof of Lemma 4. Concretely, let $\vec{u}_j$ be the unit vector with $\vec{u}_j(j) = 1$ and $\vec{u}_j(i) = 0$ for $i \ne j$ then $BasisTrans(E')$ is the set

$$ \left\{ \begin{array}{l|l} ((\vec{k},\overline{\vec{g}}), \overline{r},(\vec{k}',\overline{\vec{g}}')) \in \overline{\mathcal {T}}& (\vec{k}',\overline{\vec{g}}') \in E' \wedge \overline{r}= (l_i,l_j,\overline{\varphi },\vec{uv}, \tau ) \wedge \\ & \left[ (\vec{k},\overline{\vec{g}}) \xrightarrow {\overline{r}} (\vec{k}',\overline{\vec{g}}') \vee ((\vec{k},\overline{\vec{g}}) \xrightarrow {\overline{r}} (\vec{k}' {+} \vec{u}_j,\overline{\vec{g}}'){\wedge }\vec{k}'[j]{=}0) \right] \end{array} \right\} . $$

The following corollary, derived from the aforementioned definitions, will be instrumental in establishing the correctness of our algorithms.

Corollary 1

Given $\overline{\sigma }'_1,\overline{\sigma }'_2 \in \overline{\varSigma }$, if $\overline{\sigma }'_1 \lesssim \overline{\sigma }'_2$ then $\{\overline{r}\in \overline{\mathcal {R}}\mid (\overline{\sigma }_1,\overline{r},\overline{\sigma }'_1) \in BasisTrans(\overline{\sigma }'_1) \} = \{\overline{r}\in \overline{\mathcal {R}}\mid (\overline{\sigma }_2,\overline{r},\overline{\sigma }'_2) \in BasisTrans(\overline{\sigma }'_2) \}$.

5.3 WSTS-Based General Parameterized Coverability Checking

In this section, we present our algorithm for solving the general parameterized coverability problem (see Definition 5). Given an abstract counter system $\mathsf {ACS(\overline{A})}$ and a finite set of error configurations $ERR \subseteq \overline{\varSigma }$, we say that a path of $\mathsf {ACS(\overline{A})}$ is an error path if it starts in $\overline{\varSigma }_0$ and ends in the upward closure of ERR. Lemma 4 enables us to use the transitive closure of CPredBasis(ERR) to compute a set of error paths in $\mathsf {ACS(\overline{A})}$ or a fixed-point in which no initial configuration occurs. As defined in Sect. 4, we can examine whether any of these abstract error paths corresponds to error paths in $\mathsf {CS(A)}$. The detailed algorithm is given in Algorithm 1, which we explain in the following.

Procedure takes as argument an $\mathsf {ACS(\overline{A})}$ and a basis for a set of error configurations ERR. After initializing local variables, the procedure enters a while loop that, given $E_{i-1}$, invokes (Line 5), a sub-procedure to compute $E_i \supseteq minBasis(pred({\uparrow }E_{i-1}))$ such that $\forall \overline{\sigma }\in {\uparrow }E_i \,\; \forall j <i \,\; \overline{\sigma }\not \in {\uparrow }E_j$. also updates visitedTrans which represents the set of transitions explored so far. The loop breaks once a fixed-point has been reached. Line 7 computes the visited set of initial configurations visitedInitialConfigs. If visitedInitialConfigs is not empty then the procedure extracts all computed error paths (Line 9). starts from the discovered initial configuration(s) and uses visitedTrans to construct the error graph errGraph which encodes all error paths. In a breadth-first fashion, the procedure then unfolds errGraph and evaluates the spuriousness of uncovered error paths that starts in $\overline{\varSigma }_0$ and ends in ERR (Line 10). A path $\mathsf {\bar{\pi }}=\overline{\sigma }_0,\overline{r}_0,\ldots ,\overline{r}_{j-1},\overline{\sigma }_j$ in errGraph is spurious if $Concretize(\mathsf {\bar{\pi }}) \wedge Cover(\mathsf {\bar{\pi }})$ is unsatisfiable (see Sect. 4). In the presence of cycles^{Footnote 3}, unfolding may not terminate in the presence of shared-variable decrements and/or resets (see Theorem 1). If all error paths are spurious, we conclude that the system is safe. Otherwise a concrete path is returned as a witness of the buggy system (Line 12).

Procedure takes as argument the current set $E_{i-1}$, and the set of visited transitions visitedTrans. We compute $E_i$ as the predecessors of $E_{i-1}$ as explained at the end of Sect. 5.2 (Line 16). At this point, $E_i$ may contain configurations larger than those we have already explored. To ensure termination, they must be removed. However, before removing these configurations, we collect their visited transitions since these may introduce new, unexplored behaviors. The computation of a comprehensive set of visited transitions, denoted as visitedTrans in this context, is crucial for the correctness of our algorithm. The importance of this lies in the necessity to address spurious counterexamples, which mandates retrieving all paths from an initial configuration to an error configuration. This is in contrast to ordinary backward model checking, where a single counterexample suffices. After adding all computed transitions to visitedTrans (Line 17), we check for each configuration $\overline{\sigma }$ in $E_i$ the following:

For every configuration $\overline{\sigma }_s$ with $\overline{\sigma }_s \lesssim \overline{\sigma }$, we add from $BasisTrans(E_{i-1})$ transitions that start in $\overline{\sigma }$ to visitedTrans after replacing $\overline{\sigma }$ with $\overline{\sigma }_s$.
Otherwise, we add $\overline{\sigma }$ to $finalE_i$. Also, for every $\overline{\sigma }_b \in finalE_i$ with $\overline{\sigma }\lesssim \overline{\sigma }_b$, we add from $BasisTrans(E_{i-1})$ transitions that start in $\overline{\sigma }_b$ to visitedTrans after replacing $\overline{\sigma }_b$ with $\overline{\sigma }$, and we remove $\overline{\sigma }_b$ from $finalE_i$.

5.4 Correctness

Algorithm 1 is sound, complete, and terminates when the TA is restricted as in [8, 43, 44, 46, 47]. Soundness follows directly from encoding non-spuriousness into the constraint $Concretize(\mathsf {\bar{\pi }}) \wedge Cover(\mathsf {\bar{\pi }})$, as defined in Sect. 4.

Corollary 2

(Soundness). Algorithm 1 is sound. That is, if the algorithm computes a non-spurious error path, then there is a configuration in $\uparrow ERR$ that is reachable in $\mathsf {CS(A)}$.

With Lemma 1, the following lemma proves the algorithm’s completeness.

Lemma 5

(Completeness). If $\mathsf {ACS(\overline{A})}$ has a non-spurious error path $\mathsf {\bar{\pi }}$, then Algorithm 1 will find it.

Termination. Since Algorithm 1 is sound and complete, it will terminate whenever $\textsf{CS}$ has a path that covers $\varSigma _{spec}$. However, Lines 21 and 25 of the algorithm could create cycles in errGraph. In the presence of cycles however, the algorithm could compute an infinite sequence of error paths. Therefore, we proved the following theorem.

Theorem 1

If $\forall \overline{r}\in \overline{\mathcal {R}}: \overline{r}.\vec{uv}\in |\mathbb {N}_0|^{|\varGamma |} \wedge \overline{r}.\tau = \emptyset $, then Algorithm 1 terminates.

It is important to note that if errGraph is acyclic, the algorithm is also guaranteed to terminate. This includes the case when errGraph is empty, i.e., the corresponding $\textsf{ACS}$ has neither spurious nor non-spurious error paths. The algorithm also terminates if no cycle in errGraph has decrements or resets.

6 Reachability via (0, 1)-Abstraction

Two configurations $\sigma , \sigma '$ may be comparable with respect to order $\lesssim $ even if some location l is occupied in $\sigma $ while it is not occupied in $\sigma '$. This implies that an algorithm based on upward-closed sets wrt. $\lesssim $ cannot be used to decide reachability properties. Furthermore, note that coverability and reachability specifications (Definition 6 and 7) are agnostic to the precise number of processes in each location, i.e., they require only to distinguish between locations that are occupied by one or more processes and those that are not occupied.

To enable reachability checking and to enhance the performance of our algorithm for (non-general) coverability checking, we introduce a second, similar abstraction of our system model as in [39], where each counter can only assume one of two values: 1 to indicate that the location is currently occupied by at least one process; and 0 to indicate that it is not occupied.

(0, 1)-Configuration. Given an abstract threshold automaton $\overline{A}=(L, \mathcal {I}, \overline{\varGamma }, \varPi ,$ $\overline{\mathcal {R}})$, a (0, 1)-configuration is a tuple ${\sigma ^{z}}= (\vec{k}^{z}, \overline{\vec{g}})$, where $\vec{k}^{z}\in \mathbb {B}^{\left| {L} \right| }$, and $\overline{\vec{g}}$ is defined as before. That is $\vec{k}^{z}[i]$ indicates the presence (1) or absence (0) of at least one process at location i.

Definition 8

A (0, 1)-counter system (or $\textsf{ZCS}$) of $\overline{A}=(L, \mathcal {I}, \overline{\varGamma }, \varPi ,\overline{\mathcal {R}})$, is a transition system $\mathsf {ZCS(\overline{A})}= (\varSigma ^z, \varSigma ^z_0, \mathcal {T}^{z})$, where:

$\varSigma ^z= \mathbb {B}^{\left| {L} \right| } \times \mathcal {D}^{|\overline{\varGamma }|}$, is the set of (0, 1)-configurations
$\varSigma ^z_0 \subseteq \varSigma ^z$ is the set of (0, 1)-configurations ${\sigma ^{z}}$ that satisfy the following:
- $\forall i \in \varGamma : \,\, {\sigma ^{z}}.\overline{\vec{g}}[i] = I_0$
- $\forall i \in L: \,\, {\sigma ^{z}}.\vec{k}^{z}[i]=1 \implies i \in \mathcal {I}$
the transition relation $\mathcal {T}^{z}$ is the set of transitions $({\sigma ^{z}}, \overline{r}, {\sigma ^{z}}')$ with:
- $\overline{r}=\{\textit{from},\rightarrow ,\overline{\varphi },\vec{uv}\} \in \overline{\mathcal {R}}$
- ${\sigma ^{z}}.\overline{\vec{g}}\models \overline{r}.\overline{\varphi }$
- ${\sigma ^{z}}.\vec{k}^{z}[\overline{r}.\textit{from}] = 1$ and $({\sigma ^{z}}'.\vec{k}^{z}[\overline{r}.\textit{from}] = 0$ or ${\sigma ^{z}}'.\vec{k}^{z}[\overline{r}.\textit{from}] = 1)$
- ${\sigma ^{z}}'.\vec{k}^{z}[\overline{r}.\rightarrow ] = 1$
- ${\sigma ^{z}}'.\overline{\vec{g}}= {\sigma ^{z}}.\overline{\vec{g}}\dot{+} \vec{uv}$
- $\forall x_i \in \overline{r}.\tau : \,\, {\sigma ^{z}}'.\overline{\vec{g}}[i] = I_0$

Paths. A sequence ${\sigma ^{z}}_0, \overline{r}_0, {\sigma ^{z}}_1,\ldots ,{\sigma ^{z}}_{k-1},\overline{r}_{k-1}, {\sigma ^{z}}_k$ of alternating (0, 1)-configur-ations and abstract rules is a path of $\mathsf {ZCS(\overline{A})}$ if $\forall i < k$ we have $({\sigma ^{z}}_{i},\overline{r}_i,{\sigma ^{z}}_{i+1}) \in \mathcal {T}^{z}$. We denote by $Paths(\mathsf {ZCS(\overline{A})})$ the set of all paths of $\mathsf {ZCS(\overline{A})}$.

We say that a 01-configuration ${\sigma ^{z}}$ satisfies a reachability specification $L_{spec}$, denoted ${\sigma ^{z}}\models L_{spec}$, if for all $i \in L_{=0}$, ${\sigma ^{z}}.\vec{k}^z[i]=0$, and for all $i \in L_{>0}$, ${\sigma ^{z}}.\vec{k}^z[i]>0$. We say that $\mathsf {ZCS(\overline{A})}$ satisfies $L_{spec}$, denoted $\mathsf {ZCS(\overline{A})}\models L_{spec}$, if there is a non-spurious path of $\mathsf {ZCS(\overline{A})}$ that ends in ${\sigma ^{z}}$ with ${\sigma ^{z}}\models L_{spec}$.

Together with Lemma 2, the following lemma shows that our (0, 1)-abstraction is precise for reachability and coverability specifications. In other words, we have $\mathsf {ZCS(\overline{A})}\models L_{spec}$ if and only if $\mathsf {CS(A)}\models L_{spec}$, given that $L_{spec}$ is a reachability or coverability specification.

Lemma 6

Let $\overline{A}=(L, \mathcal {I}, \overline{\varGamma }, \varPi ,\overline{\mathcal {R}})$ be an abstract TA, and $L_{spec}$ a reachability or coverability specification. We assume w.l.o.g. that $\mathcal {I}= \{l_0\}$.

Then, there exists a path ${\sigma ^{z}}_0, \overline{r}_0, {\sigma ^{z}}_1, \ldots , \overline{r}_{n-1}, {\sigma ^{z}}_n \in Paths(\mathsf {ZCS(\overline{A})})$ such that ${\sigma ^{z}}_n \models L_{spec}$ if and only if there exists a path $\overline{\sigma }_0, \overline{r}_0, \overline{\sigma }_1, \ldots , \overline{r}_{m-1}, \overline{\sigma }_m \in Paths(\mathsf {ACS(\overline{A})})$ such that $\overline{\sigma }_m \models L_{spec}$.

6.1 Parameterized Reachability Algorithm (PRA)

Our algorithm PRA for solving the parameterized reachability problem accepts two input parameters: a 01-counter system $\mathsf {ZCS(\overline{A})}$ and a finite set of error configurations $ERR \subseteq \varSigma ^z$. It outputs either “The system is safe” to indicate that no (0, 1)-configuration in ERR is reachable, or an error path of $\mathsf {ZCS(\overline{A})}$. PRA is very similar to Algorithm 1, differing primarily in the approach to computing the predecessor set. Instead of checking if two configurations are comparable, we look for equality. Moreover, in contrast to Algorithm 1, checking whether an error path $\mathsf {\pi ^z}$ in $\mathsf {ZCS(\overline{A})}$ is spurious is based on constraint $Reach(\mathsf {\pi ^z})$ instead of $Cover(\mathsf {\pi ^z})$, as described in Sect. 4. More details on PRA can be found in [10].

Just like Algorithm 1, we have demonstrated soundness and completeness of PRA in a general context, along with termination, subject to the restrictions outlined in Theorem 1. Moreover, with minimal adjustments to the reachability algorithm, we can also check coverability within a $\textsf{ZCS}$ (see again [10]).

7 Implementation and Experimental Evaluation

We implemented Algorithm 1 with explicit, unbounded integer counters and PRA symbolically, leveraging CUDD Decision Diagrams [55] as BDD package. Both use Z3 [25] as SMT solver back-end. We evaluated our implementations on an AMD Ryzen 7 5800X CPU, running at 3.8 GHz with 32 GiB memory of system memory. For comparisons with ByMC [46], the state-of-the-art model checker for threshold automata, we executed the tool in the VM provided by the authors on the same machine, with 6 out of 8 cores and 25GiB of the total system memory. To ensure an equal environment, we also executed our tool in a virtual machine with the same restrictions.^{Footnote 4}

Table 1. Comparison of execution time (in s) between ByMC [46] and our symbolic implementation of PRA. In the Prop. column, A denotes agreement, V denotes validity, and U denotes unforgeability. TO denotes a timeout after 1.5 h.

Full size table

As benchmarks in our decidable fragment, we used the following threshold-based algorithms from the literature [42]: folklore reliable broadcast (frb) [20], one-step consensus with zero-degradation (cf1s) [18], consensus in one communication step (c1cs) [19], consistent broadcast (strb), asynchronous byzantine agreement (aba) [17], non-blocking atomic commit (nbacr [53] and nbacg [34]), condition-based consensus (cc) [50], and byzantine one step consensus (bosco) [56]. Moreover, we considered the following parts of the Red Belly blockchain [22], which have already been modeled as TA in [12]: the broadcast protocol (RB-bc), the one-round consensus protocol (RB), and a simplified one-round consensus protocol (RB-Simpl).

We verified interesting safety specifications: agreement (consistent decisions among correct processes), validity (the value that has been decided must have been proposed by some process), and unforgeability (if all correct processes have an initial value of 0, then no correct process ever accepts).

Table 1 compares execution times of ByMC and our implementation of PRA. It shows that PRA significantly outperforms ByMC on all benchmarks except two. Except for Red Belly protocols, our non-symbolic implementation can verify all the benchmarks above in less than 2 min.

For the undecidable fragment, we used our extended model of TA to model the following FTDAs from the literature: reliable broadcast [57], k-set agreement [21], multi-round simplified Red Belly blockchain consensus [22], multi-round full Red Belly blockchain consensus [22], and phase king consensus [11]. Note that, in [12], only the one-round protocols were modeled, and only the simple version was verified. In contrast, we can model the multi-round versions, and were able to verify all (single- and multi-round) versions except one.

Table 2. Execution time (in s) for PRA on multi-round protocols. TO denotes a timeout after 2 h.

Full size table

For the aforementioned TAs, we conducted our experiments on a machine with 2x AMD EPYC 7773x - 128 Cores, 256 Threads and 2TB RAM. Table 2 presents execution for running our symbolic implementation on all the aforementioned multi-round protocols. For the phase-king protocol, PRA was able to locate a bug in an incorrect model of the algorithm within 2 minutes, and was also able to prove partial correctness properties, for example that consensus is actually reachable, in around 5 min. We verified validity for all the remaining benchmarks, and for floodmin we verified additionally agreement. During benchmarking, memory usage peaked at 73 GB.

8 Related Work

After presenting an approach for verifying FTDAs based on abstraction [40], Konnov et al. [43,44,45,46] developed several approaches and algorithms specifically tailored for verifying the safety and liveness properties of threshold automata. Starting from an acyclic CFA (control flow automaton), they construct in [40] a finite counter system using parametric interval abstraction for variables and counters. To refine the abstraction, they detect spurious transitions using ordinary model checking and user-defined invariants. Subsequent works [43, 45] improve on the efficiency when verifying reachability properties in TAs. In [45], they showed the existence of an upper bound on the distance between states within counter systems of TAs, hence, demonstrating the completeness of bounded model checking. In [43], the authors use partial order reduction to generate a finite set of sequences comprising sets of guards and rules. Each of these sequences represents a possibly infinite set of error traces. An SMT solver is employed to validate the existence of concrete error traces. Extending [43], Konnov et al. presented an approach capable of detecting lasso-shaped traces that violate a given liveness property [44]. The latter two approaches have been implemented in the tool ByMC [46]. To enable the functionality of all aforementioned approaches, the authors found it necessary to impose constraints on threshold automata. This involved explicitly prohibiting cycles and variable decrements.

A verification tool for parameterized distributed algorithms has been introduced in [59]. The tool relies on layered threshold automata [14] as a system model, which can be seen as an infinitely repeating threshold automata. However, the tool requires users to identify layers (rounds) in the model, their sequence (infinite interleaving or lasso-shaped sequences), and to provide predicates.

Decidability and the complexity of verification and synthesis of threshold automata have also been studied in [8]. Their decision procedure is based on an SMT encoding of potential error paths, where in general the size of the SMT formula grows exponentially with the length of the paths. While having achieved good results with some heuristics that avoid this exponential blow-up in practice, we believe that these heuristics would not work for threshold automata with decrements and/or resets^{Footnote 5}. Moreover, their method requires a bound on the number of changes in the valuation of thresholds. However, such a constraint does not apply in the presence of decrements and/or resets.

9 Conclusion

In this paper, we have introduced an extension of the computational model known as threshold automata, to support decrements and resets of shared variables. This extension in general comes at the cost of decidability, even for simple state-reachability properties. We developed a semi-decision procedure for this extended notion of TA, supporting not only the simple coverability properties from the TA literature, but also general coverability properties as known from Petri nets or well-structured systems. To support also reachability properties, we presented an additional abstraction, called (0, 1)-abstraction, which is the basis for a semi-decision procedure for reachability properties of extended TAs.

We have implemented our techniques and evaluated them on examples from the literature, and on several round-based algorithms that cannot be modeled with canonical TAs. We show that our semi-decision procedure can find bugs in a faulty protocol and prove correctness of protocols, even outside the known decidable fragment. Moreover, on a set of benchmarks in the decidable fragment, it matches or outperforms the TA model checker ByMC [46].

Notes

1.
This is needed only in cases where an update affects any of the guards of $r_i.\varphi $.
2.
Formal proofs of all our results can be found in the extended version of the paper [10].
3.
A cycle is a sub-sequence of a path that starts and ends in the same configuration.
4.
The benchmark files and a container image with our tool are available on Zenodo [9].
5.
We could not verify this conjecture since their code is not publicly available.

References

Abdulla, P.A., Cerans, K., Jonsson, B., Tsay, Y.K.: General decidability theorems for infinite-state systems. In: Proceedings 11th Annual IEEE Symposium on Logic in Computer Science, pp. 313–321. IEEE (1996)
Google Scholar
Abdulla, P.A., Haziza, F., Holík, L.: Parameterized verification through view abstraction. Int. J. Softw. Tools Technol. Transf. 18(5), 495–516 (2016). https://doi.org/10.1007/S10009-015-0406-X
Article Google Scholar
Aminof, B., Jacobs, S., Khalimov, A., Rubin, S.: Parameterized model checking of token-passing systems. In: McMillan, K.L., Rival, X. (eds.) VMCAI 2014. LNCS, vol. 8318, pp. 262–281. Springer, Heidelberg (2014). https://doi.org/10.1007/978-3-642-54013-4_15
Chapter Google Scholar
Aminof, B., Kotek, T., Rubin, S., Spegni, F., Veith, H.: Parameterized model checking of rendezvous systems. In: Baldan, P., Gorla, D. (eds.) CONCUR 2014. LNCS, vol. 8704, pp. 109–124. Springer, Heidelberg (2014). https://doi.org/10.1007/978-3-662-44584-6_9
Chapter Google Scholar
Außerlechner, S., Jacobs, S., Khalimov, A.: Tight cutoffs for guarded protocols with fairness. In: Jobstmann, B., Leino, K.R.M. (eds.) VMCAI 2016. LNCS, vol. 9583, pp. 476–494. Springer, Heidelberg (2016). https://doi.org/10.1007/978-3-662-49122-5_23
Chapter Google Scholar
Balasubramanian, A.R., Bertrand, N., Markey, N.: Parameterized verification of synchronization in constrained reconfigurable broadcast networks. In: Beyer, D., Huisman, M. (eds.) TACAS 2018. LNCS, vol. 10806, pp. 38–54. Springer, Cham (2018). https://doi.org/10.1007/978-3-319-89963-3_3
Chapter Google Scholar
Balasubramanian, A.R., Guillou, L., Weil-Kennedy, C.: Parameterized analysis of reconfigurable broadcast networks. In: FoSSaCS 2022. LNCS, vol. 13242, pp. 61–80. Springer, Cham (2022). https://doi.org/10.1007/978-3-030-99253-8_4
Chapter Google Scholar
Balasubramanian, A.R., Esparza, J., Lazić, M.: Complexity of verification and synthesis of threshold automata. In: Hung, D.V., Sokolsky, O. (eds.) ATVA 2020. LNCS, vol. 12302, pp. 144–160. Springer, Cham (2020). https://doi.org/10.1007/978-3-030-59152-6_8
Chapter Google Scholar
Baumeister, T., Eichler, P., Jacobs, S., Sakr, M., Völp, M.: Parameterized verification of round-based distributed algorithms via extended threshold automata -. Artifact (2024). https://doi.org/10.5281/zenodo.12513748
Article Google Scholar
Baumeister, T., Eichler, P., Jacobs, S., Sakr, M., Völp, M.: Parameterized verification of round-based distributed algorithms via extended threshold automata (2024). https://arxiv.org/abs/2406.19880
Berman, P., Garay, J.A., Perry, K.J., et al.: Towards optimal distributed consensus. In: FOCS, vol. 89, pp. 410–415 (1989)
Google Scholar
Bertrand, N., Gramoli, V., Konnov, I., Lazic, M., Tholoniat, P., Widder, J.: Holistic verification of blockchain consensus. In: DISC. LIPIcs, vol. 246, pp. 10:1–10:24. Schloss Dagstuhl - Leibniz-Zentrum für Informatik (2022)
Google Scholar
Bertrand, N., Thomas, B., Widder, J.: Guard automata for the verification of safety and liveness of distributed algorithms. In: Haddad, S., Varacca, D. (eds.) 32nd International Conference on Concurrency Theory, CONCUR 2021, August 24–27, 2021, Virtual Conference. LIPIcs, vol. 203, pp. 15:1–15:17. Schloss Dagstuhl - Leibniz-Zentrum für Informatik (2021). https://doi.org/10.4230/LIPICS.CONCUR.2021.15
Bertrand, N., Thomas, B., Widder, J.: Guard automata for the verification of safety and liveness of distributed algorithms. In: Concur 2021-International Conference on Concurrency Theory, pp. 1–17 (2021)
Google Scholar
Bloem, R., et al.: Decidability of Parameterized Verification. Synthesis Lectures on Distributed Computing Theory, Morgan & Claypool Publishers (2015). https://doi.org/10.2200/S00658ED1V01Y201508DCT013
Bouajjani, A., Jonsson, B., Nilsson, M., Touili, T.: Regular model checking. In: Emerson, E.A., Sistla, A.P. (eds.) CAV 2000. LNCS, vol. 1855, pp. 403–418. Springer, Heidelberg (2000). https://doi.org/10.1007/10722167_31
Chapter Google Scholar
Bracha, G., Toueg, S.: Asynchronous consensus and broadcast protocols. J. ACM (JACM) 32(4), 824–840 (1985)
Article MathSciNet Google Scholar
Brasileiro, F., Greve, F., Mostefaoui, A., Raynal, M.: Consensus in one communication step. In: Malyshkin, V. (ed.) PaCT 2001. LNCS, vol. 2127, pp. 42–50. Springer, Heidelberg (2001). https://doi.org/10.1007/3-540-44743-1_4
Chapter Google Scholar
Brasileiro, F., Greve, F., Mostefaoui, A., Raynal, M.: Consensus in one communication step. In: Malyshkin, V. (ed.) PaCT 2001. LNCS, vol. 2127, pp. 42–50. Springer, Heidelberg (2001). https://doi.org/10.1007/3-540-44743-1_4
Chapter Google Scholar
Chandra, T.D., Toueg, S.: Unreliable failure detectors for reliable distributed systems. J. ACM (JACM) 43(2), 225–267 (1996)
Article MathSciNet Google Scholar
Chaudhuri, S., Erlihy, M., Lynch, N.A., Tuttle, M.R.: Tight bounds for k-set agreement. J. ACM (JACM) 47(5), 912–943 (2000)
Article MathSciNet Google Scholar
Crain, T., Gramoli, V., Larrea, M., Raynal, M.: DBFT: efficient leaderless byzantine consensus and its application to blockchains. In: 2018 IEEE 17th International Symposium on Network Computing and Applications (NCA), pp. 1–8. IEEE (2018)
Google Scholar
Crain, T., Natoli, C., Gramoli, V.: Red belly: a secure, fair and scalable open blockchain. In: SP, pp. 466–483. IEEE (2021)
Google Scholar
Czerwinski, W., Orlikowski, L.: Reachability in vector addition systems is Ackermann-complete. In: 62nd IEEE Annual Symposium on Foundations of Computer Science, FOCS 2021, Denver, CO, USA, February 7-10, 2022. pp. 1229–1240. IEEE (2021). https://doi.org/10.1109/FOCS52979.2021.00120
de Moura, L., Bjørner, N.: Z3: an efficient SMT solver. In: Ramakrishnan, C.R., Rehof, J. (eds.) TACAS 2008. LNCS, vol. 4963, pp. 337–340. Springer, Heidelberg (2008). https://doi.org/10.1007/978-3-540-78800-3_24
Chapter Google Scholar
Delzanno, G., Sangnier, A., Zavattaro, G.: Parameterized verification of ad hoc networks. In: Gastin, P., Laroussinie, F. (eds.) CONCUR 2010. LNCS, vol. 6269, pp. 313–327. Springer, Heidelberg (2010). https://doi.org/10.1007/978-3-642-15375-4_22
Chapter Google Scholar
Emerson, E.A., Kahlon, V.: Model checking guarded protocols. In: LICS, pp. 361–370. IEEE Computer Society (2003). https://doi.org/10.1109/LICS.2003.1210076
Emerson, E.A., Namjoshi, K.S.: On reasoning about rings. Found. Comput. Sci. 14(4), 527–549 (2003). https://doi.org/10.1142/S0129054103001881
Article MathSciNet Google Scholar
Emerson, E.A., Kahlon, V.: Reducing model checking of the many to the few. In: McAllester, D. (ed.) CADE 2000. LNCS (LNAI), vol. 1831, pp. 236–254. Springer, Heidelberg (2000). https://doi.org/10.1007/10721959_19
Chapter Google Scholar
Esparza, J., Finkel, A., Mayr, R.: On the verification of broadcast protocols. In: LICS, pp. 352–359. IEEE Computer Society (1999). https://doi.org/10.1109/LICS.1999.782630
Finkel, A.: The minimal coverability graph for Petri nets. In: Rozenberg, G. (ed.) ICATPN 1991. LNCS, vol. 674, pp. 210–243. Springer, Heidelberg (1993). https://doi.org/10.1007/3-540-56689-9_45
Chapter Google Scholar
Finkel, A., Schnoebelen, P.: Well-structured transition systems everywhere! Theoret. Comput. Sci. 256(1–2), 63–92 (2001)
Article MathSciNet Google Scholar
German, S.M., Sistla, A.P.: Reasoning about systems with many processes. J. ACM 39(3), 675–735 (1992). https://doi.org/10.1145/146637.146681
Article MathSciNet Google Scholar
Guerraoui, R.: Non-blocking atomic commit in asynchronous distributed systems with failure detectors. Distrib. Comput. 15(1), 17–25 (2002)
Article Google Scholar
Hawblitzel, C., et al.: Ironfleet: proving safety and liveness of practical distributed systems. Commun. ACM 60(7), 83–92 (2017). https://doi.org/10.1145/3068608
Jaber, N., Jacobs, S., Wagner, C., Kulkarni, M., Samanta, R.: Parameterized verification of systems with global synchronization and guards. In: Lahiri, S.K., Wang, C. (eds.) CAV 2020. LNCS, vol. 12224, pp. 299–323. Springer, Cham (2020). https://doi.org/10.1007/978-3-030-53288-8_15
Chapter Google Scholar
Jaber, N., Wagner, C., Jacobs, S., Kulkarni, M., Samanta, R.: Quicksilver: modeling and parameterized verification for distributed agreement-based systems. Proc. ACM Program. Lang. 5(OOPSLA), 1–31 (2021). https://doi.org/10.1145/3485534
Jacobs, S., Sakr, M.: Analyzing guarded protocols: better cutoffs, more systems, more expressivity. In: VMCAI 2018. LNCS, vol. 10747, pp. 247–268. Springer, Cham (2018). https://doi.org/10.1007/978-3-319-73721-8_12
Chapter Google Scholar
Jacobs, S., Sakr, M., Völp, M.: Automatic repair and deadlock detection for parameterized systems. In: Conference on Formal Methods in Computer-Aided Design–FMCAD 2022, p. 225 (2022)
Google Scholar
John, A., Konnov, I., Schmid, U., Veith, H., Widder, J.: Parameterized model checking of fault-tolerant distributed algorithms by abstraction. In: 2013 Formal Methods in Computer-Aided Design. pp. 201–209. IEEE (2013)
Google Scholar
Kaiser, A., Kroening, D., Wahl, T.: Dynamic cutoff detection in parameterized concurrent programs. In: Touili, T., Cook, B., Jackson, P. (eds.) CAV 2010. LNCS, vol. 6174, pp. 645–659. Springer, Heidelberg (2010). https://doi.org/10.1007/978-3-642-14295-6_55
Chapter Google Scholar
Konnov, I.: Fault-tolerant benchmarks. https://github.com/konnov/fault-tolerant-benchmarks/tree/master/cav15
Konnov, I., Lazić, M., Veith, H., Widder, J.: Para 2: parameterized path reduction, acceleration, and SMT for reachability in threshold-guarded distributed algorithms. Formal Meth. Syst. Des. 51(2), 270–307 (2017)
Article Google Scholar
Konnov, I., Lazić, M., Veith, H., Widder, J.: A short counterexample property for safety and liveness verification of fault-tolerant distributed algorithms. In: Proceedings of the 44th ACM SIGPLAN Symposium on Principles of Programming Languages, pp. 719–734 (2017)
Google Scholar
Konnov, I., Veith, H., Widder, J.: On the completeness of bounded model checking for threshold-based distributed algorithms: reachability. Inf. Comput. 252, 95–109 (2017)
Article MathSciNet Google Scholar
Konnov, I., Widder, J.: ByMC: byzantine model checker. In: Margaria, T., Steffen, B. (eds.) ISoLA 2018. LNCS, vol. 11246, pp. 327–342. Springer, Cham (2018). https://doi.org/10.1007/978-3-030-03424-5_22
Chapter Google Scholar
Kukovec, J., Konnov, I., Widder, J.: Reachability in parameterized systems: all flavors of threshold automata. In: CONCUR 2018-29th International Conference on Concurrency Theory (2018)
Google Scholar
Marić, O., Sprenger, C., Basin, D.: Cutoff bounds for consensus algorithms. In: Majumdar, R., Kunčak, V. (eds.) CAV 2017. LNCS, vol. 10427, pp. 217–237. Springer, Cham (2017). https://doi.org/10.1007/978-3-319-63390-9_12
Chapter Google Scholar
McMillan, K.L., Padon, O.: Ivy: a multi-modal verification tool for distributed algorithms. In: Lahiri, S.K., Wang, C. (eds.) CAV 2020. LNCS, vol. 12225, pp. 190–202. Springer, Cham (2020). https://doi.org/10.1007/978-3-030-53291-8_12
Chapter Google Scholar
Mostéfaoui, A., Mourgaya, E., Parvédy, P.R., Raynal, M.: Evaluating the condition-based approach to solve consensus. In: 2003 International Conference on Dependable Systems and Networks, 2003. Proceedings, pp. 541–541. IEEE Computer Society (2003)
Google Scholar
Rackoff, C.: The covering and boundedness problems for vector addition systems. Theoret. Comput. Sci. 6(2), 223–231 (1978)
Article MathSciNet Google Scholar
Rahli, V., Guaspari, D., Bickford, M., Constable, R.L.: Formal specification, verification, and implementation of fault-tolerant systems using eventml. Electron. Commun. Eur. Assoc. Softw. Sci. Technol. 72 (2015). https://doi.org/10.14279/TUJ.ECEASST.72.1013
Raynal, M.: A case study of agreement problems in distributed systems: non-blocking atomic commitment. In: Proceedings 1997 High-Assurance Engineering Workshop, pp. 209–214. IEEE (1997)
Google Scholar
Schmitz, S., Schnoebelen, P.: The power of well-structured systems. In: D’Argenio, P.R., Melgratti, H. (eds.) CONCUR 2013. LNCS, vol. 8052, pp. 5–24. Springer, Heidelberg (2013). https://doi.org/10.1007/978-3-642-40184-8_2
Chapter Google Scholar
Somenzi, F.: CUDD: cu decision diagram package release 2.3. 0. University of Colorado at Boulder 621 (1998)
Google Scholar
Song, Y.J., van Renesse, R.: Bosco: one-step byzantine asynchronous consensus. In: Taubenfeld, G. (ed.) DISC 2008. LNCS, vol. 5218, pp. 438–450. Springer, Heidelberg (2008). https://doi.org/10.1007/978-3-540-87779-0_30
Chapter Google Scholar
Srikanth, T., Toueg, S.: Simulating authenticated broadcasts to derive simple fault-tolerant algorithms. Distrib. Comput. 2(2), 80–94 (1987)
Article Google Scholar
Suzuki, I.: Proving properties of a ring of finite-state machines. Inf. Process. Lett. 28(4), 213–214 (1988). https://doi.org/10.1016/0020-0190(88)90211-6
Article MathSciNet Google Scholar
Thomas, B., Sankur, O.: Pylta: a verification tool for parameterized distributed algorithms. In: International Conference on Tools and Algorithms for the Construction and Analysis of Systems, pp. 28–35. Springer, Cham (2023). https://doi.org/10.1007/978-3-031-30820-8_4
Wilcox, J.R., et al.: Verdi: a framework for implementing and formally verifying distributed systems. In: Grove, D., Blackburn, S.M. (eds.) Proceedings of the 36th ACM SIGPLAN Conference on Programming Language Design and Implementation, Portland, OR, USA, June 15–17, 2015, pp. 357–368. ACM (2015). https://doi.org/10.1145/2737924.2737958

Download references

Acknowledgments

T. Baumeister and P. Eichler carried out this work as members of the Saarbrücken Graduate School of Computer Science. This research was funded in whole or in part by the German Research Foundation (DFG) grant 513487900 and the Luxembourg National Research Fund (FNR) grant C22/IS/17432184. For the purpose of open access, and in fulfilment of the obligations arising from the grant agreement, the author has applied a Creative Commons Attribution 4.0 International (CC BY 4.0) license to any Author Accepted Manuscript version arising from this submission.

Author information

Authors and Affiliations

CISPA Helmholtz Center for Information Security, Saarbrücken, Germany
Tom Baumeister, Paul Eichler & Swen Jacobs
SnT, Luxembourg University, Esch-sur-Alzette, Luxembourg
Mouhammad Sakr & Marcus Völp

Authors

Tom Baumeister
View author publications
You can also search for this author in PubMed Google Scholar
Paul Eichler
View author publications
You can also search for this author in PubMed Google Scholar
Swen Jacobs
View author publications
You can also search for this author in PubMed Google Scholar
Mouhammad Sakr
View author publications
You can also search for this author in PubMed Google Scholar
Marcus Völp
View author publications
You can also search for this author in PubMed Google Scholar

Corresponding author

Correspondence to Tom Baumeister .

Editor information

Editors and Affiliations

Karlsruhe Institute of Technology, Karlsruhe, Germany
André Platzer
Iowa State University, Ames, IA, USA
Kristin Yvonne Rozier
Politecnico di Milano, Milan, Italy
Matteo Pradella
Politecnico di Milano, Milan, Italy
Matteo Rossi

Ethics declarations

Data Availability

The program, benchmark scripts and the benchmark files as evaluated in Sect. 7 are available at https://doi.org/10.5281/zenodo.12527556.

Rights and permissions

Open Access This chapter is licensed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license and indicate if changes were made.

The images or other third party material in this chapter are included in the chapter's Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the chapter's Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.

Reprints and permissions

Copyright information

About this paper

Cite this paper

Baumeister, T., Eichler, P., Jacobs, S., Sakr, M., Völp, M. (2025). Parameterized Verification of Round-Based Distributed Algorithms via Extended Threshold Automata. In: Platzer, A., Rozier, K.Y., Pradella, M., Rossi, M. (eds) Formal Methods. FM 2024. Lecture Notes in Computer Science, vol 14933. Springer, Cham. https://doi.org/10.1007/978-3-031-71162-6_33

Download citation

DOI: https://doi.org/10.1007/978-3-031-71162-6_33
Published: 11 September 2024
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-71161-9
Online ISBN: 978-3-031-71162-6
eBook Packages: Computer ScienceComputer Science (R0)

Publish with us

Policies and ethics

Parameterized Verification of Round-Based Distributed Algorithms via Extended Threshold Automata