1 Introduction

In recent work [28] we applied bounded model checking to verify reachability properties of threshold-based fault-tolerant distributed algorithms (FTDA), which are parameterized in the number of processes n and the fraction of faults t, e.g., \(n>3t\). Moreover, we showed how to make bounded model checking complete in the parameterized case. In particular, we considered counter systems where we record for each local state, how many processes are in this state. We have one counter per local state \(\ell \), denoted by \(\kappa [\ell ]\). A process step from local state \(\ell \) to local state \(\ell '\) is modeled by decrementing \(\kappa [\ell ]\) and incrementing \(\kappa [\ell ']\). When \(\delta \) processes perform the same step one after the other, we allow the processes to do the accelerated step that instantaneously changes two counters by \(\delta \). The number \(\delta \) is called acceleration factor, it can vary within a single run.

As we focus on FTDAs, we consider specific counter systems, namely those defined by threshold automata. Here, transitions are guarded by threshold guards that compare a shared integer variable to a linear combination of parameters, e.g., \(x\ge n-t\) or \(x<t\), where x is a shared variable and n and t are parameters.

Completeness of the method [28] with respect to reachability is shown by proving a bound on the diameter of the accelerated system. Inspired by Lamport’s view of distributed computation as partial order on events [30], our method is in essence an offline partial order reduction. Instead of pruning executions that are “similar” to ones explored before [22, 38, 43], we use the partial order to show (offline) that every run has a similar run of bounded length. Interestingly, the bound is independent of the parameters. In combination with the data abstraction of [25], we obtained the following automated method [28]:

  1. 1.

    Apply a parametric data abstraction to the process code to get a finite state process description, and construct the threshold automaton (TA) [25, 27].

  2. 2.

    Compute the diameter bound, based on the control flow of the TA.

  3. 3.

    Construct a system with abstract counters, i.e., a counter abstraction [25, 39].

  4. 4.

    Perform SAT-based bounded model checking [7, 16] up to the diameter bound, to check whether bad states are reached in the counter abstraction.

  5. 5.

    If a counterexample is found, check its feasibility and refine, if needed [13, 25].

While this allowed us to automatically verify several FTDAs not verified before, there remained two bottlenecks for scalability to larger and more complex protocols: First, due to abstraction there were spurious counterexamples. Second, counter abstraction works well in practice only for processes with a few dozens of local states, but it does not scale to hundreds of local states; partly because many different interleavings result in a large search space.

To address these bottlenecks, we make two crucial contributions in this paper: First, to eliminate one of the two sources of spurious counterexamples, namely, the non-determinism added by abstract counters, we do bounded model checking using SMT solvers with linear integer arithmetic on the accelerated system, instead of SAT-based bounded model checking on the counter abstraction.

Second, we reduce the search space dramatically: We introduce the notion of an execution schema that is defined as a sequence of local rules of the TA. By assigning to each rule of a schema an acceleration factor (possibly 0), one obtains a run of the counter system. Hence, each schema represents infinitely many runs. We show how to construct a set of schemas whose set of reachable states coincides with the set of reachable states of the accelerated counter system.

Our construction can be seen as an aggressive partial order reduction, where each run has a similar run generated by a schema from the set. To show this, we capture the guards that are locked and unlocked in a locking context. Our key insight is that a bounded number of transitions changes the context in each run. For example, of all transitions increasing a variable x, at most one makes \(x \ge n-t\) true, and at most one makes \(x<t\) false (the parameters n and t are fixed in a run). We fix those transitions that change the context, and apply the ideas of partial order reduction to the subexecutions between these transitions.

Our experiments show that SMT solvers and schemas outperform SAT solvers and counter abstraction in parameterized verification of threshold-based FTDAs. Indeed, we verified safety of complicated FTDAs [10, 18, 23, 37, 40, 41] that have not been automatically verified before. In addition we achieved dramatic speedup and reduced memory footprint on previously verified FTDAs [9, 12, 42] (cf. [28]).

Fig. 1.
figure 1

An example threshold automaton.

2 A Motivating Example

Figure 1 is an example of a threshold automaton TA over two shared variables x and y and parameters n, t, and f. It is inspired by the distributed asynchronous broadcast protocol from [9], where \(n - f\) correct processes concurrently execute TA, and f processes are Byzantine. As is typical for FTDAs, the parameters must satisfy a resilience condition, e.g., \(n > 3t \wedge t \ge f \ge 0\) stating that less than a third of the processes is faulty. The circles depict the local states \(\ell _1, \dots , \ell _5\), two of them are the initial states \(\ell _1, \ell _2\). The edges depict the rules \(r_1, \dots , r_6\) labeled with guarded commands \(\varphi \mapsto \mathsf {act}\), where \(\varphi \) is one of the threshold guards “\(\varphi _1 :x \ge \lceil (n+t)/2 \rceil - f\)”, “\(\varphi _2 :y \ge (t + 1) - f\)”, and “\(\varphi _3 :y \ge (2t + 1) - f\)”, and an action \(\mathsf {act}\) increases the shared variables x or y by one, or zero (as in rule \(r_6\)).

Every local state \(\ell _i\) has a non-negative counter \({\mathbf {\varvec{\kappa }}}[\ell _i]\) that represents the number of processes in \(\ell _i\). Together with the values of x, y, n, t, and f, the values of the counters constitute a configuration of the system. In the initial configuration there are \(n-f\) processes in initial states, i.e., \({\mathbf {\varvec{\kappa }}}[\ell _1] + {\mathbf {\varvec{\kappa }}}[\ell _2] = n - f\), and the other counters and the shared variables x and y are zero.

The rules define the transitions of the counter system. For instance, according to the rule \(r_2\), if in the current configuration the guard \(y \ge t+1-f\) holds true and \({\mathbf {\varvec{\kappa }}}[\ell _1] \ge 5\), then five processes can instantaneously move out of the local state \(\ell _1\) to the local state \(\ell _3\), and increment x as prescribed by the action of \(r_2\). This results in increase of x and the counter \({\mathbf {\varvec{\kappa }}}[\ell _3]\) by five, and counter \({\mathbf {\varvec{\kappa }}}[\ell _1]\) is decreased by five. When, as in this example, rule \(r_2\) is conceptually executed by 5 processes, we denote this transition by \(r^5_2\).

We now consider the runs more closely. As initially x and y are zero, threshold guards \(\varphi _1\), \(\varphi _2\), and \(\varphi _3\) evaluate to false. As rules may only increase variables, these guards may eventually become true. (In this example we do not consider guards like \(x<t\) that are initially true and become false, although we treat them later.) In fact, initially only \(r_1\) is unlocked. Because \(r_1\) increases x, it may unlock \(\varphi _1\). Thus \(r_4\) becomes unlocked. Rule \(r_4\) increases y and thus repeated execution of \(r_4\) (by different processes) first unlocks \(\varphi _2\) and then \(\varphi _3\). We call the set of conditions which evaluate to true in a configuration the context. For our example we observe that each run goes through the following sequence of contexts \(\{\}\), \(\{\varphi _1\}\), \(\{\varphi _1, \varphi _2\}\), and \(\{\varphi _1, \varphi _2, \varphi _3\}\). In fact, the sequence of contexts in an execution of a TA is always monotonic.

The conjunction of the guards in the context \(\{\varphi _1, \varphi _2\}\) implies the guards of the rules \(r_1, r_2, r_3, r_4, r_5\); we say they are unlocked in the context. A technical challenge addressed in this paper is to show that a fixed sequence of these rules can “capture” all schedules allowed in this context. To this end we analyze the control flow of the TA. Our example is acyclic up to self-loops, and thus the control flow establishes a partial order on the rules. We show in this paper that we can use any linear extension of this partial order as the required fixed sequence, e.g., \(r_1 < r_2 < r_3 < r_4 < r_5\). (In this example we do not deal with loops, although we handle them in Sect. 4.1.) It remains to deal with transitions that actually change the context. In our example, only \(r_4\) or \(r_5\) can change the context from \(\{\varphi _1, \varphi _2\}\) to \(\{\varphi _1, \varphi _2, \varphi _3 \}\). Therefore we append \(r_4, r_5\) —that change the context —to the fixed sequence for the context. Thus, we obtain the following schema, where inside the curly brackets we give the contexts, and between two contexts the fixed sequences of rules. (We discuss the underlined rules below.)

We now show how each schedule is captured by schema S. Consider, e.g., a schedule from the initial state \(\sigma _0\) with \(n=5\), \(t=f=1\), \({\mathbf {\varvec{\kappa }}}[\ell _1] = 1\), and \({\mathbf {\varvec{\kappa }}}[\ell _2] = 3\). We are interested in whether there is a schedule that reaches a configuration, where all processes are in state \(\ell _5\). Consider the following schedule:

Observe that after \(r^1_1,r^1_1\), variable \(x=2\) and \(\varphi _1\) is true. Hence transition \(t_1\) changes the context from \(\{\}\) to \(\{\varphi _1\}\). Similarly \(t_2\) and \(t_3\) change the context. Context changing transitions are marked with curly brackets. Between them we have the subschedules \(\tau _1, \dots , \tau _4\) (\(\tau _3\) is empty) marked with square brackets.

To show that this schedule is captured by the schema, we apply partial order arguments regarding distributed computations: As the guards \(\varphi _2\) and \(\varphi _3\) evaluate to true in \(\tau _4\), and \(r_5\) precedes \(r_6\) in the control flow of the TA, all transitions \(r^1_5\) can be moved to the left in \(\tau _4\). Similarly, \(r^1_1\) can be moved to the left in \(\tau _2\). The resulting schedule is applicable and leads to the same configuration as the original one. Further, we can accelerate the adjacent transitions with the same rule, e.g., the sequence \(r^1_5, r^1_5\) can be transformed into \(r^2_5\). Thus, we transform subschedules \(\tau _i\) into \(\tau '_i\), and arrive at the following schedule \(\tau '\) that we call the representative schedule of \(\tau \). Importantly for reachability checking, if \(\tau \) and \(\tau '\) are applied to the same configuration, they end in the same configuration.

Reconsidering schema S, we observe that the sequence of underlined rules in S matches the schedule \(\tau '\). In this paper we show that every schedule can be transformed into a representative schedule that matches one schema from a small set of schemas. Each schema in this set corresponds to one of the monotonic sequences of contexts, and is constructed following the ideas from above. Completeness regarding reachability follows from the fact that each schedule goes through a monotonic sequence of contexts. For each schema, reachability can be expressed by an SMT formula involving both state variables and parameters.

3 Parameterized Counter Systems

We extend the framework of [28]. A threshold automaton describes a process in a concurrent system, and is a tuple  defined below.

States. The finite set \({\mathcal L}\) contains the local states, and \({\mathcal I}\subseteq {\mathcal L}\) is the set of initial states. The finite set \(\varGamma \) contains the shared variables that range over \({\mathbb N}_0\). The finite set \(\varPi \) is a set of parameter variables that range over \({\mathbb N}_0\), and the resilience condition RC is a formula over parameter variables in linear integer arithmetic, e.g., \(n>3t\). The set of admissible parameters is .

Rules. A rule defines a conditional transition between local states that may update the shared variables. Formally, a rule is a tuple \(({ from }, { to }, \varphi ^{\scriptscriptstyle {\le }}, \varphi ^{\scriptscriptstyle {>}}, \mathbf {u})\): The local states \({ from }\) and \({ to }\) are from \({\mathcal L}\), and capture from which local state to which a process moves. A rule is only executed if the conditions \(\varphi ^{\scriptscriptstyle {\le }}\) and \(\varphi ^{\scriptscriptstyle {>}}\) evaluate to true. Each condition is a conjunction of guards. Each guard is defined using some shared variable \(x \in \varGamma \), coefficients \(a_0, \dots , a_{|\varPi |}\in {\mathbb Z}\), and parameter variables \(p_1, \dots , p_{|\varPi |}\in \varPi \) so that \(a_0 + \sum \nolimits _{i=1}^{{|\varPi |}} a_i \cdot p_i\ \le \ x \text{ and } a_0 + \sum \nolimits _{i=1}^{{|\varPi |}} a_i \cdot p_i\ >\ x \) are a lower guard and upper guard, respectively. Let \(\varPhi ^{\scriptscriptstyle {U}}\) and \(\varPhi ^{\scriptscriptstyle {L}}\) be the sets of lower and upper guards. The set \(\mathsf {guard}(\varphi ^{\scriptscriptstyle {\le }}) \subseteq \varPhi ^{\scriptscriptstyle {U}}\) is the set of guards used in \(\varphi ^{\scriptscriptstyle {\le }}\), while the set \(\mathsf {guard}(\varphi ^{\scriptscriptstyle {>}}) \subseteq \varPhi ^{\scriptscriptstyle {L}}\) is the set of guards used in \(\varphi ^{\scriptscriptstyle {>}}\).

Rules may increase shared variables using an update vector \(\mathbf {u}\in {\mathbb N}_0^{|\varGamma |}\) that is added to the vector of shared variables. Finally, \({\mathcal R}\) is the finite set of rules.

Definition 1

Given a threshold automaton , we define the precedence relation \(\prec _{\scriptscriptstyle { P }}\): for a pair of rules \(r_1, r_2 \in {\mathcal R}\), it holds that \(r_1 \prec _{\scriptscriptstyle { P }}r_2\) if and only if \(r_1.{ to }= r_2.{ from }\). We denote by \(\prec ^+_{\scriptscriptstyle { P }}\) the transitive closure of \(\prec _{\scriptscriptstyle { P }}\). Further, we say that \(r_1 \sim _{\scriptscriptstyle { P }}r_2\), if \(r_1 \prec ^+_{\scriptscriptstyle { P }}r_2 \, \wedge \, r_2 \prec ^+_{\scriptscriptstyle { P }}r_1\), or \(r_1 = r_2\).

As in [28], we limit ourselves to threshold automata relevant for FTDAs, namely those where \(r.\mathbf {u}= \mathbf {0}\) for all rules \(r\in {\mathcal R}\) that satisfy \(r \prec ^+_{\scriptscriptstyle { P }}r\).

Looplets. The relation \(\sim _{\scriptscriptstyle { P }}\) defines equivalence classes of rules. An equivalence class corresponds to a loop or a single rule that is not part of a loop. Hence, we use the term looplet for one such equivalence class. For a given set of rules \({\mathcal R}\) let \({\mathcal R}/{\sim }\) be the set of equivalence classes defined by \(\sim _{\scriptscriptstyle { P }}\). We denote by \([{r}]\) the equivalence class of rule r. For two classes \(c_1\) and \(c_2\) from \({\mathcal R}/{\sim }\) we write \(c_1 \prec _{\scriptscriptstyle { C }}c_2\) iff there are two rules \(r_1\) and \(r_2\) in \({\mathcal R}\) satisfying \([{r_1}]=c_1\) and \([{r_2}]=c_2\) and \(r_1 \prec ^+_{\scriptscriptstyle { P }}r_2\) and \(r_1 \not \sim _{\scriptscriptstyle { P }}r_2\). As the relation \(\prec _{\scriptscriptstyle { C }}\) is a strict partial order, there are linear extensions of \(\prec _{\scriptscriptstyle { C }}\). Below, we fix an arbitrary of these linear extensions to sort transitions in a schedule: We denote by \(\prec ^{ lin }_{\scriptscriptstyle { C }}\) a linear extension of \(\prec _{\scriptscriptstyle { C }}\).

3.1 Counter Systems

Given a threshold automaton TA, a function \(N:\mathbf {P}_{RC}\rightarrow {\mathbb N}_0\) that determines the number of processes to be modeled (typically, \(N(n,t,f) = n-f\)) and admissible parameter values \(\mathbf {p}\in \mathbf {P}_{RC}\), we define a counter system as a transition system \((\varSigma ,I,R)\), that consists of the set of configurations \(\varSigma \), which contain the counters and variables, the set of initial configurations \(I\), and the transition relation \(R\):

Configurations \(\varSigma \) and \(I.\) A configuration \(\sigma =({\mathbf {\varvec{\kappa }}},\mathbf {g},\mathbf {p})\) consists of a vector of counter values \(\sigma .{\mathbf {\varvec{\kappa }}}\in {\mathbb N}_0^{|{\mathcal L}|}\) (for simplicity we use the convention that \({\mathcal L}= \{1, \dots , |{\mathcal L}| \}\)) a vector of shared variable values \(\sigma .\mathbf {g}\in {\mathbb N}_0^{|\varGamma |}\), and a vector of parameter values \(\sigma .\mathbf {p}= \mathbf {p}\). The set \(\varSigma \) is the set of all configurations. The set of initial configurations \(I\) contains the configurations that satisfy \(\sigma .\mathbf {g}= \mathbf {0}\), \(\sum _{i \in {\mathcal I}} \sigma .{\mathbf {\varvec{\kappa }}}[i] = N(\mathbf {p})\), and \(\sum _{i \not \in {\mathcal I}} \sigma .{\mathbf {\varvec{\kappa }}}[i] = 0\).

Transition Relation \(R.\) A transition is a pair \(t=({ rule },{ factor })\) of a rule of the \(\mathsf TA \) and a non-negative integer called the acceleration factor, or just factor for short. For a transition \(t=({ rule },{ factor })\) we refer by \(t.\mathbf {u}\) to \({ rule }.\mathbf {u}\), by \(t.\varphi ^{\scriptscriptstyle {>}}\) to \({ rule }.\varphi ^{\scriptscriptstyle {>}}\), etc. We say a transition t is unlocked in configuration \(\sigma \) if \( \forall k \in \{0, \dots , t.{ factor }- 1 \}.\; (\sigma .{\mathbf {\varvec{\kappa }}},\sigma .\mathbf {g}+ k\cdot t.\mathbf {u}, \sigma .\mathbf {p}) \models t.\varphi ^{\scriptscriptstyle {\le }}\wedge t.\varphi ^{\scriptscriptstyle {>}}.\)

A transition t is applicable (or enabled) in configuration \(\sigma \), if it is unlocked, and \(\sigma .{\mathbf {\varvec{\kappa }}}[t.{ from }] \ge t.{ factor }\), or \(t.{ factor }=0\). We say that \(\sigma '\) is the result of applying the enabled transition t to \(\sigma \), and use the notation \(\sigma ' = t(\sigma )\), if

  • \(\sigma '.\mathbf {g}= \sigma .\mathbf {g}+ t.{ factor }\cdot t.\mathbf {u}\) and \(\sigma '.\mathbf {p}= \sigma .\mathbf {p}\)

  • if \(t.{ from }\ne t.{ to }\) then \(\sigma '.{\mathbf {\varvec{\kappa }}}[t.{ from }]=\sigma .{\mathbf {\varvec{\kappa }}}[t.{ from }] - t.{ factor }\) and \(\sigma '.{\mathbf {\varvec{\kappa }}}[t.{ to }]=\sigma .{\mathbf {\varvec{\kappa }}}[t.{ to }] + t.{ factor }\) and \(\forall \ell \in {\mathcal L}\setminus \{t.{ from }, t.{ to }\}.\; \sigma '.{\mathbf {\varvec{\kappa }}}[\ell ]=\sigma .{\mathbf {\varvec{\kappa }}}[\ell ]\)

  • if \(t.{ from }= t.{ to }\) then \(\sigma '.{\mathbf {\varvec{\kappa }}}= \sigma .{\mathbf {\varvec{\kappa }}}\)

The transition relation \(R\subseteq \varSigma \times \varSigma \) of the counter system is defined as follows: \((\sigma , \sigma ') \in R\) iff there is a \(r\in {\mathcal R}\) and a \(k\in {\mathbb N}_0\) such that \(\sigma ' = t(\sigma )\) for \(t=(r,k)\). As updates to shared variables do not decrease their values, we obtain:

Proposition 1

([28]). For all configurations \(\sigma \), all rules r, and all transitions t applicable to \(\sigma \), the following holds:

$$\begin{aligned} \begin{array}{ll} { 1.}~ If \sigma \models r.\varphi ^{\scriptscriptstyle {\le }}then t(\sigma ) \models r.\varphi ^{\scriptscriptstyle {\le }}\qquad \qquad &{} { 3.}~ If \sigma \not \models r.\varphi ^{\scriptscriptstyle {>}}then t(\sigma ) \not \models r.\varphi ^{\scriptscriptstyle {>}}\\ { 2.}~ If t(\sigma ) \not \models r.\varphi ^{\scriptscriptstyle {\le }}then \sigma \not \models r.\varphi ^{\scriptscriptstyle {\le }}&{} { 4.}~ If t(\sigma ) \models r.\varphi ^{\scriptscriptstyle {>}}then \sigma \models r.\varphi ^{\scriptscriptstyle {>}}\end{array} \end{aligned}$$

Schedules and Paths. A schedule is a sequence of transitions. For a schedule \(\tau \) and an index \(i: 1 \le i \le |\tau |\), by t[i] we denote the ith transition of \(\tau \), and by \(\tau ^i\) we denote the prefix \(t[1], \dots , t[i]\) of \(\tau \). A schedule \(\tau = t_1, \dots , t_m\) is applicable to configuration \(\sigma _0\), if there is a sequence of configurations \(\sigma _1,\dots , \sigma _m\) with \(\sigma _i = t_{i} (\sigma _{i-1})\) for \(1 \le i \le m\). If there is a \(t_i.{ factor }>1\), then a schedule is accelerated.

By \(\tau \cdot \tau '\) we denote the concatenation of two schedules \(\tau \) and \(\tau '\). A sequence \(\sigma _0, t_1, \sigma _1, \dots , \sigma _{k-1}, t_k, \sigma _k\) of alternating configurations and transitions is called a (finite) path, if transition \(t_i\) is enabled in \(\sigma _i\) and \(\sigma _i = t_i(\sigma _{i-1})\), for \(1 \le i \le k\). For a configuration \(\sigma _0\) and a schedule \(\tau \) applicable to \(\sigma \), by \(\mathsf path (\sigma _0, \tau )\) we denote the path \(\sigma _0, t_1, \dots , t_{|\tau |}, \sigma _{|\tau |}\) with \(t_i = \tau [i]\) and \(\sigma _i = t_i(\sigma _{i-1})\), for \(1 \le i \le |\tau |\).

3.2 Contexts and Slices

The evaluation of the guards in the sets \(\varPhi ^{\scriptscriptstyle {U}}\) and \(\varPhi ^{\scriptscriptstyle {L}}\) solely defines whether certain rules are unlocked. Due to Proposition 1, we infer that when the transitions of a schedule are applied, more and more guards from \(\varPhi ^{\scriptscriptstyle {U}}\) become unlocked and more and more guards from \(\varPhi ^{\scriptscriptstyle {L}}\) become locked. To capture this, we define:

Definition 2

A context is a pair \((\varOmega ^{\scriptscriptstyle {U}}, \Omega ^{\scriptscriptstyle {L}})\) of subsets \(\varOmega ^{\scriptscriptstyle {U}}\subseteq \varPhi ^{\scriptscriptstyle {U}}\) and \(\Omega ^{\scriptscriptstyle {L}}\subseteq \varPhi ^{\scriptscriptstyle {L}}\). We denote by \(\varOmega \) the pair \((\varOmega ^{\scriptscriptstyle {U}}, \Omega ^{\scriptscriptstyle {L}})\), and by \(|\varOmega | = |\varOmega ^{\scriptscriptstyle {U}}| + |\Omega ^{\scriptscriptstyle {L}}|\).

For two contexts \((\varOmega ^{\scriptscriptstyle {U}}_1, \Omega ^{\scriptscriptstyle {L}}_1)\) and \((\varOmega ^{\scriptscriptstyle {U}}_2, \Omega ^{\scriptscriptstyle {L}}_2)\), we define that \((\varOmega ^{\scriptscriptstyle {U}}_1, \Omega ^{\scriptscriptstyle {L}}_1) \sqsubset (\varOmega ^{\scriptscriptstyle {U}}_2, \Omega ^{\scriptscriptstyle {L}}_2)\) if and only if \(\varOmega ^{\scriptscriptstyle {U}}_1 \cup \Omega ^{\scriptscriptstyle {L}}_1 \subset \varOmega ^{\scriptscriptstyle {U}}_2 \cup \Omega ^{\scriptscriptstyle {L}}_2\). Then, a sequence of contexts \(\varOmega _1, \dots , \varOmega _m\) is monotonically increasing, if \(\varOmega _i \sqsubset \varOmega _{i+1}\) for \(1 \le i < m\). Further, a monotonically increasing sequence of contexts \(\varOmega _1, \dots , \varOmega _m\) is maximal, if \(\varOmega _1 = (\emptyset , \emptyset )\) and \(\varOmega _m = (\varPhi ^{\scriptscriptstyle {U}}, \varPhi ^{\scriptscriptstyle {L}})\) and \(|\varOmega _{i+1}| = |\varOmega _i| + 1\), for \(1 \le i < m\). We obtain:

Proposition 2

Every maximal monotonically increasing sequence of contexts is of length \(|\varPhi ^{\scriptscriptstyle {U}}| + |\varPhi ^{\scriptscriptstyle {L}}| + 1\). There are at most \((|\varPhi ^{\scriptscriptstyle {U}}| + |\varPhi ^{\scriptscriptstyle {L}}|)!\,\) such sequences.

Definition 3

Given a threshold automaton, we define its configuration context as a function \(\omega : \varSigma \rightarrow 2^{\varPhi ^{\scriptscriptstyle {U}}} \times 2^{\varPhi ^{\scriptscriptstyle {L}}}\) that for each configuration \(\sigma \in \varSigma \) gives a context \((\varOmega ^{\scriptscriptstyle {U}},\Omega ^{\scriptscriptstyle {L}})\) with \(\varOmega ^{\scriptscriptstyle {U}}= \{ \varphi \in \varPhi ^{\scriptscriptstyle {U}}:\sigma \models \varphi \}\) and \(\Omega ^{\scriptscriptstyle {L}}= \{ \varphi \in \varPhi ^{\scriptscriptstyle {L}}:\sigma \not \models \varphi \}\).

Proposition 3

If a transition t is enabled in a configuration \(\sigma \), then either \(\omega (\sigma ) \sqsubset \omega (t(\sigma ))\), or \(\omega (\sigma ) = \omega (t(\sigma ))\).

We say that a schedule \(\tau \) is steady for a configuration \(\sigma \), if for every prefix \(\tau '\) of \(\tau \), the context does not change, i.e., \(\omega (\tau '(\sigma )) = \omega (\sigma )\).

Proposition 4

A schedule \(\tau \) is steady for a configuration \(\sigma \) if and only if \(\omega (\sigma ) = \omega (\tau (\sigma ))\).

Given a configuration \(\sigma \) and a schedule \(\tau \) applicable to \(\sigma \), we say that \(\mathsf path (\sigma , \tau )\) is consistent with a sequence of contexts \(\varOmega _1, \dots , \varOmega _m\), if the set of indices \(\{0, \dots , |\tau |\}\) can be partitioned into m (possibly empty) disjoint sets \(I_1, \dots , I_m\) such that \(\omega (\tau ^i(\sigma )) = \varOmega _k\), for \(1 \le k \le m\) and \(i \in I_k\).

A context defines which rules of the \(\mathsf TA \) are unlocked. As we consider steady schedules, we need to understand, which rules are unlocked in that schedule:

Definition 4

Given a threshold automaton  and a context \(\varOmega \), we define the slice of TA with context \(\varOmega \) as a threshold automaton , where a rule \(r\in {\mathcal R}\) belongs to \({\mathcal R}|_{\varOmega }\) if and only if \(\big (\bigwedge _{\varphi \in \varOmega ^{\scriptscriptstyle {U}}} \varphi \big ) \rightarrow r.\varphi ^{\scriptscriptstyle {\le }}\) and \(\big (\bigwedge _{\psi \in \varPhi ^{\scriptscriptstyle {L}}\setminus \Omega ^{\scriptscriptstyle {L}}} \psi \big ) \rightarrow r.\varphi ^{\scriptscriptstyle {>}}\).

3.3 Parameterized Reachability

Given a threshold automaton \(\mathsf TA \), a state property B is a boolean combination of formulas that have the form \(\bigwedge _{i \in Y} {\mathbf {\varvec{\kappa }}}[i] = 0\), for some \(Y \subseteq {\mathcal L}\). The parameterized reachability problem is to decide whether there are parameter values \(\mathbf {p}\in \mathbf {P}_{RC}\), an initial configuration \(\sigma _0 \in I\), with \(\sigma _0.\mathbf {p}= \mathbf {p}\), and a schedule \(\tau \) satisfying that \(\tau \) is applicable to \(\sigma _0\), and property B holds in the final state: \(\tau (\sigma _0) \models B\).

4 Main Result: A Complete Set of Schemas

We introduce the notion of a schema that is an alternating sequence of contexts and sequences of rules. A schema serves as a pattern for an infinite set of paths, and can be used to efficiently encode parameterized reachability in SMT. We show how to construct a set of schemas \(\mathcal{S}\) with the following property: for each schedule \(\tau \) and each configuration \(\sigma \), there is a representative schedule, i.e., a schedule that if applied to \(\sigma \), ends in \(\tau (\sigma )\), and is generated by a schema from \(\mathcal{S}\).

Definition 5

A schema is a sequence \(\varOmega _0, \rho _1, \varOmega _1, \dots , \rho _m, \varOmega _m\) of alternating contexts and rule sequences. Often we write \(\{\varOmega _0\} \rho _1 \{\varOmega _1\} \dots \{\varOmega _{m-1}\} \rho _m \{\varOmega _m\}\) for a schema. A schema with two contexts is called simple.

Given two schemas \(S_1 =\varOmega _0, \rho _1, \dots , \rho _k, \varOmega _k\) and \(S_2 = \varOmega '_0, \rho '_1, \dots , \rho '_m, \varOmega '_m\) with \(\varOmega _k = \varOmega '_0\), we define their composition \(S_1\,\circ \,S_2\) to be the schema that is obtained by concatenation of the two sequences: \(\varOmega _0, \rho _1, \dots , \rho _k, \varOmega '_0, \rho '_1, \dots , \rho '_m, \varOmega '_m\).

Definition 6

Given a configuration \(\sigma \) and a schedule \(\tau \) applicable to \(\sigma \), we say that \(\mathsf path (\sigma , \tau )\) is generated by a simple schema \(\{\varOmega \}\,\rho \,\{\varOmega '\}\), if the following hold:

  • For \(\rho = r_1, \dots , r_k\) there is a monotonically increasing sequence of indices \(i(1), \dots , i(m)\), i.e., \(1 \le i(1) < \cdots < i(m) \le k\), and there are factors \(f_1, \dots , f_m > 0\), so that schedule \((r_{i(1)}, f_1), \dots , (r_{i(m)}, f_m) = \tau \).

  • The first and the last states match the contexts: \(\omega (\sigma ) = \varOmega \) and \(\omega (\tau (\sigma )) = \varOmega '\).

In general, we say that \(\mathsf path (\sigma , \tau )\) is generated by a schema S, if \(S = S_1\, \circ \dots \circ \, S_k\) for simple schemas \(S_1, \dots , S_k\) and \(\tau = \tau _1 \cdots \tau _k\) such that each \(\mathsf path (\pi _i(\sigma ), \tau _i)\) is generated by the simple schema \(S_i\), for \(\pi _i = \tau _1 \cdots \tau _{i-1}\) and \(1 \le i \le k\).

The language of a schema S —denoted with \(\mathcal{L}(S)\) —is the set of all paths generated by S. For a set of configurations \(C \subseteq \varSigma \) and a set of schemas \(\mathcal{S}\), we define the set \(\mathsf Reach (C, \mathcal{S})\) to contain all configurations reachable from C via the paths generated by the schemas from \(\mathcal{S}\), i.e., \( \mathsf Reach (C, \mathcal{S}) = \{ \tau (\sigma ) \mid \sigma \in C,\ \exists S \in \mathcal{S}.\ \mathsf path (\sigma , \tau ) \in \mathcal{L}(S) \}\). We say that a set \(\mathcal{S}\) of schemas is complete, if: \(\forall C \subseteq \varSigma .\; \{ \tau (\sigma ) \mid \sigma \in C,\ \tau \text{ is } \text{ applicable } \text{ to } \sigma \} = \mathsf Reach (C, \mathcal{S})\).

In [28, Thm. 1], we introduced a quantity \({\mathcal C}\) that depends on the number of conditions in a TA, and have shown that for every configuration \(\sigma \) and every schedule \(\tau \) applicable to \(\sigma \), there is a schedule \(\tau '\) of length at most \(d=|{\mathcal R}| \cdot ({\mathcal C}+ 1) + {\mathcal C}\) that is also applicable to \(\sigma \) and results in \(\tau (\sigma )\). Hence, by enumerating all sequences of rules of length up to d, one can construct a complete set of schemas:

Corollary 1

For a threshold automaton, there is a complete schema set \(\mathcal{S}_d\) of cardinality \(|{\mathcal R}|^{|{\mathcal R}| \cdot ({\mathcal C}+ 1) + {\mathcal C}}\).

Although the set \(\mathcal{S}_d\) is finite, enumerating all its elements is impractical. We show that there is a complete set of schemas whose cardinality solely depends on the number of guards that syntactically occur in the TA. These numbers \(|\varPhi ^{\scriptscriptstyle {U}}|\) and \(|\varPhi ^{\scriptscriptstyle {L}}|\) are in practice much smaller than the number of rules \(|{\mathcal R}|\):

Theorem 1

For a threshold automaton, there is a complete schema set of cardinality at most \((|\varPhi ^{\scriptscriptstyle {U}}| + |\varPhi ^{\scriptscriptstyle {L}}|)!\), where the length of each schema does not exceed \((3 \cdot (|\varPhi ^{\scriptscriptstyle {U}}| + |\varPhi ^{\scriptscriptstyle {L}}|) + 2) \cdot |{\mathcal R}|\).

Proof Idea. Construct the set Z of all maximal monotonically increasing sequences of contexts. From Proposition 2, there are at most  \((|\varPhi ^{\scriptscriptstyle {U}}| + |\varPhi ^{\scriptscriptstyle {L}}|)!\) maximal monotonically increasing sequences of contexts. Therefore, \(|Z| \le (|\varPhi ^{\scriptscriptstyle {U}}| + |\varPhi ^{\scriptscriptstyle {L}}|)!\). Then, for each sequence \(z \in Z\), we do the following:

  1. 1.

    Show that for each configuration \(\sigma \) and each schedule \(\tau \) applicable to \(\sigma \) and consistent with the sequence z, there is a schedule \(s(\tau )\) that has a specific structure, and is also applicable to \(\sigma \). We call \(s(\tau )\) the representative of \(\tau \).

  2. 2.

    Construct a schema and show that it generates all paths of all schedules \(s(\tau )\) found in (1). The length of the schema is at most \((3 \cdot (|\varPhi ^{\scriptscriptstyle {U}}| + |\varPhi ^{\scriptscriptstyle {L}}|) + 2) \cdot |{\mathcal R}|\).

To prove Theorem 1, it remains to show existence of a representative schedule and of a schema as formulated in (1)–(2). We do this below in Proposition 9 and Theorem 2 respectively. Before that we consider special cases: when all rules of a schedule belong to the same looplet, and when a schedule is steady.

4.1 Special Case I: One Context and One Looplet

We show that for each schedule that uses only the rules from a fixed looplet and does not change its context, there exists a representative schedule of bounded length that reaches the same final state.

Proposition 5

Fix a threshold automaton, a context \(\varOmega \), and a looplet . Let \(\sigma \) be a configuration and \(\tau = t_1, \dots , t_m\) a steady schedule applicable to \(\sigma \), with \([{t_i.{ rule }}] = c\) for \(1 \le i \le |\tau |\). There exists a representative schedule \(\mathsf {crep}^{\varOmega }_{c}[\sigma , \tau ]\) with the following properties:

  1. (a)

    schedule \(\mathsf {crep}^{\varOmega }_{c}[\sigma , \tau ]\) is applicable to \(\sigma \), and \(\mathsf {crep}^{\varOmega }_{c}[\sigma , \tau ](\sigma ) = \tau (\sigma )\),

  2. (b)

    the rule of each transition t in \(\mathsf {crep}^{\varOmega }_{c}[\sigma , \tau ]\) belongs to c, that is, \([{t.{ rule }}] = c\),

  3. (c)

    schedule \(\mathsf {crep}^{\varOmega }_{c}[\sigma , \tau ]\) is not longer than \(2 \cdot |c|\).

Proof Idea for Proposition  5. If \(|c|=1\), then we use a single accelerated transition or the empty schedule as representative. If \(|c| > 1\), the rules of the slice \(\mathsf TA |_{\varOmega }\) form a strongly connected component. Then, we can choose a node h, and construct two spanning trees: an out-tree, whose edges are pointing away from h, and an in-tree, whose edges are pointing to h. Using the trees, we construct two sequences of rules sorted in the topological order of the trees: the sequence \(r_\mathsf {in}(1), \dots , r_\mathsf {in}(k)\) moves processes to h, and the sequence \(r_\mathsf {out}(1), \dots , r_\mathsf {out}(m)\) distributes the processes from h to the locations. As a result, for each location \(\ell \) in the graph, the processes are transferred from \(\ell \) to the other locations, if \(\sigma [\ell ] > \tau (\sigma )[\ell ]\), and additional processes arrive at \(\ell \), if \(\sigma [\ell ] < \tau (\sigma )[\ell ]\).

Proposition 6

Fix a threshold automaton, a context \(\varOmega \), and a looplet in the slice \(\mathsf TA |_{\varOmega }\). There exists a schema \(\mathsf {cschema}^{\varOmega }_{c}\) with the following properties: For each configuration \(\sigma \) and each steady schedule \(\tau = t_1, \dots , t_m\) applicable to \(\sigma \), if \([{t_i.{ rule }}] = c\) for \(1 \le i \le |\tau |\), then \(\mathsf path (\sigma , \tau ')\) of the representative schedule \(\tau '=\mathsf {crep}^{\varOmega }_{c}[\sigma , \tau ]\) from Proposition 5 is generated by \(\mathsf {cschema}^{\varOmega }_{c}\).

Proof idea. We construct the schema using the same sequence of rules as in Proposition 5, i.e., \(\mathsf {cschema}^{\varOmega }_{c} = \{\varOmega \}\,r_{\mathsf {in}}(1), \dots , r_{\mathsf {in}}(k),\) \(r_{\mathsf {out}}(1), \dots , r_{\mathsf {out}}(m)\,\{\varOmega \}\). It follows that \(\mathsf {cschema}^{\varOmega }_{c}\) generates all paths of the representative schedules.

4.2 Special Case II: One Context and Multiple Looplets

In this section, we show that for each steady schedule, there exists a representative steady schedule of bounded length that reaches the same final state.

Proposition 7

Fix a threshold automaton and a context \(\varOmega \). For every configuration \(\sigma \) with \(\omega (\sigma ) = \varOmega \) and every steady schedule \(\tau \) applicable to \(\sigma \), there exists a steady schedule \(\mathsf {srep}_{\varOmega }[\sigma ,\tau ]\) with the following properties:

  1. (a)

    \(\mathsf {srep}_{\varOmega }[\sigma ,\tau ]\) is applicable to \(\sigma \), and \(\mathsf {srep}_{\varOmega }[\sigma ,\tau ](\sigma ) = \tau (\sigma )\),

  2. (b)

To construct a representative schedule, we fix a context \(\varOmega \) of at TA, a configuration \(\sigma \) with \(\omega (\sigma ) = \varOmega \), and a steady schedule \(\tau \) applicable to \(\sigma \). The key notion in our construction is a projection of a schedule on a set of looplets:

Definition 7

Let \(\tau = t_1, \dots , t_k\) be a schedule and C be a set of looplets. Given an increasing sequence of indices \(i(1), \dots , i(m) \in \{1, \dots , k \}\), i.e., \(i(j) < i(j+1)\), for \(1 \le j < m\), a schedule \(t_{i(1)} \dots t_{i(m)}\) is a projection of \(\tau \) on C, if each index \(j \in \{ 1, \dots , k \}\) belongs to \(\{ i(1), \dots , i(m) \}\) if and only if \([{t_j.{ rule }}] \in C\).

In fact, each schedule \(\tau \) has a unique projection on a set C. In the following, we write \({\tau }|_{c_1,\dots ,c_m}\) to denote the projection of \(\tau \) on a set \(\{c_1,\dots ,c_m\}\).

Provided that \(c_1, \dots , c_m\) are all looplets of the slice  ordered with respect to \(\prec ^{ lin }_{\scriptscriptstyle { C }}\), we construct the following sequences of projections on each looplet (note that \(\pi _0\) is the empty schedule): \(\pi _i = {\tau }|_{c_1} \cdot \dots \cdot {\tau }|_{c_i} \text{ for } 0 \le i \le m\).

Having defined \(\{\pi _i\}_{0 \le i \le m}\), we construct the representative \(\mathsf {srep}_{\varOmega }[\sigma ,\tau ]\) simply as a concatenation of the representatives of each looplet:

$$\begin{aligned} \mathsf {srep}_{\varOmega }[\sigma ,\tau ] = \mathsf {crep}^{\varOmega }_{c_1}[\pi _0(\sigma ), {\tau }|_{c_1}] \cdot \mathsf {crep}^{\varOmega }_{c_2}[\pi _1(\sigma ), {\tau }|_{c_2}] \cdots \mathsf {crep}^{\varOmega }_{c_m}[\pi _{m-1}(\sigma ), {\tau }|_{c_m}] \end{aligned}$$

Lemma 1

(Looplet Sorting). Given a threshold automaton, a context \(\varOmega \), a configuration \(\sigma \), a steady schedule \(\tau \) applicable to \(\sigma \), and a sequence \(c_1, \dots , c_m\) of all looplets in the slice  with the property \(c_i \prec ^{ lin }_{\scriptscriptstyle { C }}c_j\) for \(1 \le i < j \le m\), the following holds:

  1. 1.

    Schedule \({\tau }|_{c_1}\) is applicable to the configuration \(\sigma \).

  2. 2.

    Schedule \({\tau }|_{c_2, \dots , c_m}\) is applicable to the configuration \({\tau }|_{c_1}(\sigma )\).

  3. 3.

    Schedule \({\tau }|_{c_1}\cdot \,{\tau }|_{c_2, \dots , c_m}\), when applied to \(\sigma \), results in configuration \(\tau (\sigma )\).

Proof

(of Proposition 7 ). By iteratively applying Lemma 1, we prove by induction that schedule \({\tau }|_{c_1} \cdot \dots \cdot {\tau }|_{c_m}\) is applicable to \(\sigma \) and results in \(\tau (\sigma )\). From Proposition 5, we conclude that each schedule \({\tau }|_{c_i}\) can be replaced by its representative \(\mathsf {crep}^{\varOmega }_{c_i}[\pi _{i-1}(\sigma ), {\tau }|_{c_i}]\). Thus, \(\mathsf {srep}_{\varOmega }[\sigma ,\tau ]\) is applicable to \(\sigma \) and results in \(\tau (\sigma )\). By Proposition 4, schedule \(\mathsf {srep}_{\varOmega }[\sigma ,\tau ]\) is steady, since \(\omega (\sigma ) = \omega (\tau (\sigma ))\).    \(\square \)

Finally, we show that for a given context, there is a schema that generates all paths of such representative schedules.

Proposition 8

Fix a threshold automaton and a context \(\varOmega \). Let \(c_1, \dots , c_m\) be the sorted sequence of all looplets of the slice , i.e., it holds that \(c_1 \prec ^{ lin }_{\scriptscriptstyle { C }}\dots \prec ^{ lin }_{\scriptscriptstyle { C }}c_m\). Schema \(\mathsf {sschema}_{\varOmega }{} = \mathsf {cschema}^{\varOmega }_{c_1} \circ \mathsf {cschema}^{\varOmega }_{c_2} \circ \cdots \circ \mathsf {cschema}^{\varOmega }_{c_m}\) satisfies: For each configuration \(\sigma \) with \(\omega (\sigma ) = \varOmega \) and each steady schedule \(\tau \) applicable to \(\sigma \), \(\mathsf path (\sigma , \tau ')\) of the representative \(\tau '=\mathsf {srep}_{\varOmega }[\sigma ,\tau ]\) is generated by \(\mathsf {sschema}_{\varOmega }\).

Proof

As for an arbitrary configuration \(\sigma \) with \(\omega (\sigma ) = \varOmega \) and a steady schedule \(\tau \) applicable to \(\sigma \), we constructed \(\mathsf {srep}_{\varOmega }[\sigma ,\tau ]\) as a sorted sequence of representatives of the looplets, all paths of \(\mathsf {srep}_{\varOmega }[\sigma ,\tau ]\) are generated by \(\mathsf {sschema}_{\varOmega }\). \(\square \)

4.3 The General Case

Using the results from Sects. 4.1 and 4.2, for each configuration and each schedule (without restrictions) we construct a representative schedule.

Proposition 9

Given a threshold automaton, a configuration \(\sigma \), and schedule \(\tau \) applicable to \(\sigma \), there exists a schedule \(\mathsf {rep}[\sigma ,\tau ]\) with the following properties:

  1. (a)

    \(\mathsf {rep}[\sigma ,\tau ]\) is applicable to \(\sigma \), and \(\mathsf {rep}[\sigma ,\tau ](\sigma ) = \tau (\sigma )\),

  2. (b)

    \(|\mathsf {rep}[\sigma ,\tau ]| \le 2 \cdot |{\mathcal R}| \cdot (|\varPhi ^{\scriptscriptstyle {U}}| + |\varPhi ^{\scriptscriptstyle {L}}| + 1) + |\varPhi ^{\scriptscriptstyle {U}}| + |\varPhi ^{\scriptscriptstyle {L}}|\).

Proof Idea. Consider the maximal monotonically increasing sequence \(\varOmega _0, \dots , \varOmega _m\) such that \(\mathsf path (\sigma , \tau )\) is consistent with the sequence. Thus, \(\tau \) contains at most m transitions that change their context, and schedules between these transitions are steady. By applying Proposition 7, we replace the steady schedules with their representatives and obtain \(\mathsf {rep}[\sigma ,\tau ]\), which is applicable to \(\sigma \) and results in \(\tau (\sigma )\). By Proposition 7, the representative of a steady schedule is not longer than \(2 \cdot |{\mathcal R}|\), which together with m transitions gives us the bound \(2 \cdot |{\mathcal R}| \cdot (m+1) + m\). By Proposition 2, the number m is \(|\varPhi ^{\scriptscriptstyle {U}}| + |\varPhi ^{\scriptscriptstyle {L}}|\). This gives us the needed bound.

Further, given a maximal monotonically increasing sequence z of contexts, we construct a schema that generates all paths of the schedules consistent with z:

Theorem 2

For a threshold automaton and a monotonically increasing sequence z of contexts, there exists a schema \(\mathsf {schema}(z)\) that generates all paths of the representative schedules that are consistent with z, and the length of \(\mathsf {schema}(z)\) does not exceed \((3 \cdot |{\mathcal R}| + 1) \cdot (|\varPhi ^{\scriptscriptstyle {U}}| + |\varPhi ^{\scriptscriptstyle {L}}|) + 2 \cdot |{\mathcal R}|\).

Proof

Given a threshold automaton, let \(\rho _{\mathsf {all}}\) be the sequence \(r_1, \dots , r_{|{\mathcal R}|}\) of all rules from \({\mathcal R}\), and \(z=\varOmega _0, \dots , \varOmega _m\) a monotonically increasing sequence of contexts. By the construction in Proposition 9, each representative schedule \(\mathsf {rep}[\sigma ,\tau ]\) consists of the representatives of steady schedules terminated with transitions that change the context. Then, for each context \(\varOmega _i\), for \(0 \le i < m\), we compose \(\mathsf {sschema}_{\varOmega }\) with \(\{\varOmega _i\}\,\rho _{\mathsf {all}}\,\{\varOmega _{i+1}\}\). This composition generates the representative of a steady schedule and the transition changing the context from \(\varOmega _i\) to \(\varOmega _{i+1}\). Consequently, we construct the \(\mathsf {schema}(z)\) as follows:

$$\begin{aligned} (\mathsf {sschema}_{\varOmega _0} \,\circ \,\{\varOmega _0\}\,\rho _{\mathsf {all}}\,\{\varOmega _1\}) \circ \dots \circ (\mathsf {sschema}_{\varOmega _{m-1}} \,\circ \,\{\varOmega _{m-1}\}\,\rho _{\mathsf {all}}\,\{\varOmega _{m}\}) \,\circ \,\mathsf {sschema}_{\varOmega _{m}} \end{aligned}$$

By inductively applying Proposition 8, we prove that \(\mathsf {schema}(z)\) generates all paths of schedules \(\mathsf {rep}[\sigma ,\tau ]\) that are consistent with the sequence z. We get the needed bound on the length of \(\mathsf {schema}(z)\) by using an argument similar to Proposition 9 and by noting that we add \(|{\mathcal R}|\) extra rules per context.    \(\square \)

Computing the Complete Set of Schemas. Our proofs show that the set of schemas is easily computed from the TA: The threshold guards are syntactic parts of the TA, and enable us to directly construct increasing sequences of contexts. To find a slice of the TA for a given context, we filter the rules with unlocked guards, i.e., check if the context contains the guard. To produce the simple schema of a looplet, we compute a spanning tree over the slice. To construct simple schemas, we do a topological sort over the looplets. For example, it takes just 30 s to compute the schemas in our longest experiment that runs for 4 h.

4.4 Optimization: Smaller Complete Sets of Schemas

Entailment Optimization. We say that a guard \(\varphi _1 \in \varPhi ^{\scriptscriptstyle {U}}\) entails a guard \(\varphi _2 \in \varPhi ^{\scriptscriptstyle {U}}\), if for all combinations of parameters \(\mathbf {p}\in \mathbf {P}_{RC}\) and shared variables \(\mathbf {g}\in {\mathbb N}_0^{|\varGamma |}\), it holds that \((\mathbf {g}, \mathbf {p}) \models \varphi _1 \rightarrow \varphi _2\). For instance, in our example, \(\varphi _3 :y \ge (2t + 1) - f\) entails \(\varphi _2 :y \ge (t + 1) - f\). If \(\varphi _1\) entails \(\varphi _2\), then we can omit all monotonically increasing sequences that contain a context \((\varOmega ^{\scriptscriptstyle {U}}, \Omega ^{\scriptscriptstyle {L}})\) with \(\varphi _1 \in \varOmega ^{\scriptscriptstyle {U}}\) and \(\varphi _2 \not \in \varOmega ^{\scriptscriptstyle {U}}\). If the number of schemas before applying this optimization is m! and there are k entailments, then the number of schemas reduces from m! to \((m-k)!\). A similar optimization is introduced for the guards from \(\varPhi ^{\scriptscriptstyle {L}}\).

Control Flow Optimization. Based on the proof of Lemma 1, we introduce the following optimization for TAs that are DAGs (possibly with self loops).

We say that a rule \(r \in {\mathcal R}\) may unlock a lower guard \(\varphi \in \varPhi ^{\scriptscriptstyle {U}}\), if there is a \(\mathbf {p}\in \mathbf {P}_{RC}\) and \(\mathbf {g}\in {\mathbb N}_0^{{|\varGamma |}}\) satisfying: \((\mathbf {g}, \mathbf {p}) \models r.\varphi ^{\scriptscriptstyle {\le }}\wedge r.\varphi ^{\scriptscriptstyle {>}}\) (the rule is unlocked); \((\mathbf {g}, \mathbf {p}) \not \models \varphi \) (the guard is locked); \((\mathbf {g}+ r.\mathbf {u}, \mathbf {p}) \models \varphi \) (the guard is now unlocked).

In our example, the rule \(r_1\) may unlock the guard \(\varphi _1\).

Let \(\varphi \in \varPhi ^{\scriptscriptstyle {U}}\) be a guard, \(r'_1, \dots , r'_m\) be the rules that use \(\varphi \), and \(r_1, \dots , r_k\) be the rules that may unlock \(\varphi \). If \(r_i \prec ^{ lin }_{\scriptscriptstyle { C }}r'_j\), for \(1 \le i \le k\) and \(1 \le j \le m\), then we exclude some sequences of contexts as follows (we call \(\varphi \) forward-unlockable). Let \(\psi _1, \dots , \psi _n \in \varPhi ^{\scriptscriptstyle {U}}\) be the guards of \(r_1, \dots , r_k\). Guard \(\varphi \) cannot be unlocked before \(\psi _1, \dots , \psi _n\), and thus we can omit all sequences of contexts, where \(\varphi \) appears in the contexts before \(\psi _1, \dots , \psi _n\). Moreover, as \(\psi _1, \dots , \psi _n\) are the only guards of the rules unlocking \(\varphi \), we omit the sequences with different combinations of contexts involving \(\varphi \) and the guards from \(\varPhi ^{\scriptscriptstyle {U}}\setminus \{\varphi , \psi _1, \dots , \psi _n\}\). Finally, as the rules \(r'_1, \dots , r'_m\) appear after the rules \(r_1, \dots , r_k\) in the order \(\prec ^{ lin }_{\scriptscriptstyle { C }}\), the rules \(r'_1, \dots , r'_m\) appear after the rules \(r_1, \dots , r_k\) in a rule sequence of every schema. Thus, we omit the combinations of the contexts involving \(\varphi \) and \(\psi _1, \dots , \psi _n\).

Hence, we add all forward-unlockable guards to the initial context (we still check the guards of the rules in the SMT encoding in Sect. 5). If the number of schemas before applying this optimization is m! and there are k forward-unlocking guards, then the number of schemas reduces from m! to \((m-k)!\). A similar optimization is introduced for the guards from \(\varPhi ^{\scriptscriptstyle {L}}\).

5 Checking a Schema with SMT

The encoding for a schema is obtained by decomposing the schema into a sequence of simple schemas and encoding the simple schemas. Given a simple schema \(S=\{\varOmega _1\}\,r_1, \dots , r_m\,\{\varOmega _2\}\), we construct an SMT formula such that every model of the formula represents a path from \(\mathcal{L}(S)\), and for every path in \(\mathcal{L}(S)\) there is a corresponding model of the formula. Thus, we need to model a path of \(m+1\) configurations and m transitions (whose acceleration factors may be 0).

To represent a configuration \(\sigma _i\), for \(0 \le i \le m\), we introduce two vectors of SMT variables: a vector \(\mathbf {k}^i = (k^i_1, \dots , k^i_{|{L}|})\) to represent the process counters, a vector \(\mathbf {x}^i = (x^i_1, \dots , x^i_{|\varGamma |})\) to represent the shared variables. We call the pair \((\mathbf {k}^i, \mathbf {x}^i)\) the layer i, for \(1 \le i \le m\).

A straightforward way to represent a bounded computation of length m is to encode the choice of a rule from \({\mathcal R}\) and to encode all the rules from \({\mathcal R}\) for each layer. In any case, we do not encode bounded computation but rather schemas, for which the sequence of rules \(r_1, \dots , r_m\) is fixed. We exploit this in two ways: First, instead of encoding the choice of a rule and encoding all rules, we encode for each layer i the constraints of rule \(r_i\). Second, as this constraint may update only two counters —\(r_i.{ from }\) and \(r_i.{ to }\) —we do not need \(|{L}|\) counter variables per layer, but only encode the two counters per layer that have actually changed. As is a common technique in bounded model checking, the counters that are not changed are “reused” from previous layers in our encoding. By doing so, we encode the schema rules with \(|{L}| + |\varGamma | + m \cdot (2 + |\varGamma |)\) integer variables, 2m equations, and at most \(m \cdot (|\varPhi ^{\scriptscriptstyle {U}}| + |\varPhi ^{\scriptscriptstyle {L}}|)\) inequalities over linear integer arithmetic.

Table 1. Summary of our experiments on AMD Opteron®6272, 32 cores, 192 GB. The symbols are: “
figure b
” for timeout of 24 h.; “
figure c
” for memory overrun of 32 GB; “
figure d
” for BDD nodes overrun; “
figure e
” for timeout in the refinement loop (24 h.); “
figure f
  ” for spurious counterexamples due to counter abstraction. \( ^\star \) In these cases, we used the control flow optimization from Sect. 4.4.

6 Experiments

Implementation. We have implemented the technique in our tool ByMC (Byzantine Model Checker [2]), which integrates with an SMT solver via the interface provided by SMTLIB2. In our experiments, we used Z3 [17] as back-end solver.

Benchmarks. We revisited several asynchronous FTDAs that we evaluated in previous work [25, 28]. In addition to these classic FTDAs, we considered asynchronous (Byzantine) consensus algorithms —namely, BOSCO [41], C1CS [10], and CF1S [18] —that are designed to work despite partial failure of the distributed system. All our benchmarks, their source code in our parametric extension of Promela, and the code of the threshold automata are freely available [1].

The challenge in the verification of FTDAs is the immense non-determinism caused by interleavings, asynchronous message passing, and faults. In our modeling, all these are reflected in non-deterministic choices in the Promela code. To obtain threshold automata, as required for our technique, our tool constructs a parametric interval data abstraction [25] that adds to non-determinism.

Evaluation. Table 1 summarizes our experiments conducted with nuXmv, FAST, and our new implementation. We evaluated four different tool configurations: our new implementation (SMT); our previous implementation that checks the counter abstraction with nuXmv [11], either using binary decision diagrams (BDD), or SAT-based bounded model checking (BMC); and the acceleration-based tool FAST [4]. We compare our results with FAST, as TAs can be encoded with counter automata [3], which FAST receives at its input. For FAST, we give only the figures using the Mona plugin, which produced the best results in our experiments. For BMC, our tool first generates a SAT formula with nuXmv and then calls the solver Lingeling [6] to check satisfiability in non-incremental mode. This works better than the incremental mode with MiniSAT, built into nuXmv.

On large problems, our new technique works significantly better than BDD- and SAT-based model checking. BDDs work extremely well on smaller problems. Importantly, our new technique does not use abstraction refinement.

NBAC and NBACC are challenging as the model checker produces many spurious counterexamples, which are an artifact of counter abstraction losing or adding processes. When using SAT-based model checking, the individual calls to nuXmv are fast, but the abstraction-refinement loop times out, due to a large number of refinements (about 500). BDD-based model checking times out when looking for a counterexample. Our new technique, preserves the number of proceses, and thus, there are no spurious counterexamples of this kind.

In comparison to the general-purpose acceleration tool FAST, our tool uses less memory and is faster on the benchmarks where FAST is successful.

As predicted by the distributed algorithms literature, our tool finds counterexamples, when we relax the resilience condition. In contrast to counter abstraction, our new technique gives concrete values of the parameters and shows how many processes move at each step.

Our new method uses integer counters and thus does not introduce spurious behavior caused by counter abstraction, but still has spurious counterexamples from parameterized data abstraction for complex FTDAs such as BOSCO, C1CS, NBAC, and NBACC. In these cases, we manually refine the interval domain by adding new symbolic interval borders, see [25]. We believe that these interval borders can be derived directly from the TA, so that no refinement is necessary in the first place, and leave this question to future work.

7 Discussions

We introduced a method to efficiently check reachability properties of FTDAs in a parameterized way. If \(n>7t\) as for BOSCO, even the simplest interesting case with \(t=2\) leads to a system size that is out of range of explicit state model checking. Hence, FTDAs force us to develop parameterized verification methods.

The problem we consider is concerned with parameterized model checking, for which many interesting results exist [14, 15, 1921, 26]. However, the FTDAs considered by us run under the different assumptions. In [28], we discuss the relation between partial orders in accelerated counter systems of threshold automata and the following work: compact programs [35], counter abstraction [5, 39], completeness thresholds [7, 16, 29], partial order reduction [8, 22, 38, 43], and Lipton’s movers [34]. We also discussed their relation to counter automata. Indeed, our result entails flattability [33] of every counter system of threshold automata: a complete set of schemas immediately gives us a flat counter automaton. Hence, the acceleration semi-algorithms [3, 33] should terminate on the systems of TAs, though it rarely happens in our experiments. Further, our execution schemas are inspired by a general notion of semi-linear path schemas SLPS [32, 33]. We construct a small complete set of schemas and thus a provably small SLPS. Besides, in our work we distinguish counter systems and counter abstraction: the former counts processes as integers, while the latter uses counters over a finite abstract domain, e.g., \(\{0,1, many \}\) [39].

Many distributed algorithms can be specified with I/O Automata [36] or TLA+ [31]. In these frameworks, correctness is typically shown with a proof assistant, while model checking is used as a debugger on small instances. Parameterized model checking is not a concern there, except one notable result [24].

Finally, to verify all properties of FTDAs, we have to check that they are not only safe, but also progress. Liveness properties is a subject to ongoing work.