
1 Introduction

Distributed systems are ubiquitous in the modern world, with many companies directly relying on them to conduct business. Due to this, the ability to ensure that a distributed system is operating correctly is paramount. The search for correctness guarantees has led to an influx of interested parties adopting formal verification methodologies in recent years. One of the most famous examples of this trend is probably the adoption of TLA\(^+\) [17] by Amazon Web Services [19]. TLA\(^+\) is a specification language based on the temporal logic of actions (TLA) which allows users to describe the expected behaviour of a system while abstracting away implementation details that do not impact high-level properties, e.g., memory management. With TLA\(^+\) specifications at hand, Amazon engineers rely on model checking for correctness guarantees of systems such as DynamoDB [23].

Despite recent interest and advances, the verification of distributed systems remains notoriously difficult. This is mainly because distributed executions admit numerous potential interleavings of steps, with state-spaces generally growing exponentially with the number of participants. In the case of TLA\(^+\), a handful of tools are available to aid in verification [14]. TLC [27] is an explicit-state model checker that enumerates all reachable states of the given system. Apalache [13] is a symbolic bounded model checker that uses a satisfiability modulo theories (SMT) encoding of states in order to better tackle the state-space explosion problem. TLAPS [6] is an interactive proof system that enables proving properties without exploring the state-space itself. Despite providing the benefit of verifying specifications with infinite state-spaces, and efforts being made towards partial automation [18], TLAPS adoption remains slow, with engineers favouring the push-button automation provided by model checkers.

In this work we focus on symbolic model checking for TLA\(^+\), as spearheaded by the SMT encoding which underpins Apalache, but provide insights into SMT-based model checking that may generalise to other contexts. The encoding of TLA\(^+\) into SMT done by Apalache removes all structural information present in the encoded specification, with all TLA\(^+\) data structures being represented via uninterpreted constants in the generated SMT formula. The information not forwarded to the SMT solver has the potential to significantly improve solving efficiency. We propose an alternative SMT encoding that makes full use of the SMT theory of arrays [8] to encode the main TLA\(^+\) data structures, i.e., sets and functions, with the goal of improving solving performance, which is the determining factor in overall model checking performance.

Concretely, we modify Apalache’s abstract reduction system (ARS) to generate constraints in the SMT theory of arrays, while relying on its preprocessing infrastructure, as shown in Figure 1. Apalache rewrites the input specification into the KerA\(^+\) verification-friendly fragment of TLA\(^+\) [13] and then applies ARS rules to generate the SMT formula to be solved. We implemented our encoding in Apalache and compared it with Apalache’s constants encoding and TLC. Our experiments indicate that embedding structural information into the SMT formulas has a significant impact on performance. Our contributions are:

Fig. 1.

Overview of the symbolic model checking for TLA\(^+\). The dotted box highlights the identification of symbolic transitions from [16] and the rewriting into KerA\(^+\). The dashed box highlights the encoding based on uninterpreted constants from [13]. The solid box highlights the arrays-based encoding we propose.

  1. Formalisation of a TLA\(^+\) encoding into the SMT theory of arrays;
  2. Development of a robust open-source implementation of our encoding;
  3. Evaluation via checking agreement on three asynchronous protocols.

The paper is structured as follows: background is given in Section 2, the arrays-based encoding and its evaluation are presented in Sections 3 and 4, related work is discussed in Section 5, and our final remarks are made in Section 6.

2 Background

In this section we introduce the basics of TLA\(^+\), its KerA\(^+\) fragment used to represent TLA\(^+\)’s core, the approach to generate SMT constraints from KerA\(^+\) via abstract reduction, and finally the SMT theory of arrays.

2.1 TLA+

We introduce TLA\(^+\) via a specification of the asynchronous Byzantine agreement protocol by Bracha and Toueg [5], shown in Figure 2. Here we focus on the most relevant TLA\(^+\) constructs, with further details being available in [17].

Fig. 2.

Example of a TLA\(^+\) specification, based on the asynchronous Byzantine agreement protocol by Bracha and Toueg [5]; simplifications made for brevity.

The first notable aspect of TLA\(^+\) is that specifications may be parametrised, e.g., the number of processes and faults may not be fixed. In our example, the keyword \(\textsc {constants}\), in line 3, is used to declare its parameters: N, the total number of processes, and T and F, the maximal and actual number of faulty processes. It is important to understand, however, that while a specification may be parametrised, model checking can only be carried out for a specific instance of the protocol at a time, e.g., \(N=4\) and \(T=F=1\). Parameter declarations are followed by variable declarations, using the \(\textsc {variables}\) keyword, in line 4. Variables define the states of the state-machine that the specification describes, with each state being defined by the combination of the values held by the variables. In our example, each state is defined by the values of sentEcho, sentReady, rcvdEcho, rcvdReady, and pc.

The remaining TLA\(^+\) operators describe state-machine transitions or properties to be checked, and are defined using \(\smash {{\mathop {=}\limits ^{\scriptscriptstyle \mathrm {\Delta }}}}\). Two operators are of special significance, one that defines the initial-state predicate and one that plays the role of the transition operator. In our example, these operators are Init, in line 8, and Next, in line 22. Concretely, Init defines the starting point for state-space exploration and Next defines the exploration itself. Transitions are guided by constraints that must hold in both pre-transition states, represented by non-primed variables, and post-transition states, represented by primed variables.

Specifications may optionally define invariants, i.e., properties that should hold in every reachable state. There is no special syntax for invariants, and they are provided by name to model checkers at invocation time. In our example, we have one invariant, NoDecide, in line 26. A specification satisfies NoDecide if no state reachable from Init via any number of Next transitions has \(pc[p] = ``AC"\), for some \(p \in Corr\). Abstractly, this invariant holds iff Decide can never be taken.

2.2 KerA+

TLA\(^+\) provides users with a myriad of ways of specifying systems. This richness, although one of its strengths, adds significant difficulty to the generation of SMT constraints. To overcome this challenge, TLA\(^+\) specifications are rewritten into a more compact language, KerA\(^+\), before being checked. From KerA\(^+\), the ARS can generate SMT constraints in a simpler and provably sound way.

The KerA\(^+\) language consists of a small subset of TLA\(^+\) conjoined with four additional constructs not originating from TLA\(^+\), and is able to express almost all TLA\(^+\) expressions. It contains constructs for the manipulation of sets, functions, records, tuples, and sequences, as well as integer arithmetic operators, Boolean and integer literals, and constants, with all data structures having a bounded size. The semantics of KerA\(^+\) derive directly from the TLA\(^+\) constructs it uses, with the non-TLA\(^+\) constructs, which help simplify the rewriting system, having simple control semantics. The correctness of the rewriting itself is guaranteed by construction. One example is the rewriting of \(S \cap T\) into the set comprehension \(\{ x \in S: x \in T \}\). Further KerA\(^+\) details are available in [13].

2.3 Abstract Reduction System

In order to verify a specification in KerA\(^+\) we generate an SMT formula that is equisatisfiable to it. To do so, we use an abstract reduction system (ARS) which iteratively applies reduction rules that transform KerA\(^+\) expressions into SMT constraints. The core of the ARS is the arena, a graph structure that overapproximates the specification’s data structures and guides rule application. The rules collapse KerA\(^+\) expressions into cells, which represent the symbolic evaluation of these expressions, with the cells then being used as vertices in the arena. The arena edges represent the data structures’ overapproximation, e.g., a cell representing a set will have directed edges to the cells representing all its potential elements, as illustrated in Figure 3. The reduction process terminates when the initial KerA\(^+\) expression e is collapsed into a single cell \(\textsf {c}\), producing an SMT formula \({\mathrm \Phi }\) in the process, such that \(\textsf {c}\wedge {\mathrm \Phi }\) is equisatisfiable to e; equisatisfiability relies on the boundedness of the data structures and is detailed in Section 3.3. The satisfiability of e can then be checked by forwarding \(\textsf {c}\wedge {\mathrm \Phi }\) to an SMT solver.

Fig. 3.

Illustration of three arenas. The captions describe the modelled elements with the overapproximation \(\textsf {c}_1 = 5\), \(\textsf {c}_2 = 6\), \(\textsf {c}_3 = 7\), \(\textsf {c}_4 = \{5, 6\}\), \(\textsf {c}_5 = \{6, 7\}\), and \(\textsf {c}_6 = \{\{5, 6\}, \{6, 7\}\}\). Note that the concrete value of a cell can be given by any of the possible subtrees having said cell as a root, e.g., for \(\textsf {c}_6\) we have that \(\exists ~ \textsf {c}_4 \in \mathcal {P}(\{5, 6\}), \textsf {c}_5 \in \mathcal {P}(\{6, 7\}) ~.~ \textsf {c}_6 \in \mathcal {P}(\{\textsf {c}_4, \textsf {c}_5\})\); \(\mathcal {P}\) stands for power set.

Formally, the ARS is defined as the pair \(({\mathcal S}, {\rightarrowtail })\), with \({\mathcal S}\) being the set of ARS states and \({\rightarrowtail }\) being the transition relation. A state \((e, {\mathcal A}, \nu , {\mathrm \Phi }) \in {\mathcal S}\) is a four-tuple containing a KerA\(^+\) expression e, an arena \({\mathcal A}\), a binding of names to cells \(\nu \), and a first-order formula \({\mathrm \Phi }\). ARS states’ elements contain a number of cells, which are first-order terms annotated with a type \(\tau \). Cells of type \(\textsf{Bool}\) and \(\textsf{Int}\) are interpreted in SMT as Booleans and integers, while cells of the remaining types are encoded as uninterpreted constants in the constants encoding; the arrays encoding approach is discussed in Section 3. Cells are referred to via the notation \(\textsf {c}_{ name }\) or \(\textsf {c}_{ index }\), and they can be seen as both KerA\(^+\) constants and first-order terms in SMT. An arena is a directed acyclic graph \({\mathcal A}= ({\mathcal V}, {\mathcal E})\), with \({\mathcal V}\) being a finite set of cells and \({\mathcal E}\subseteq {\mathcal V}\times (1..|{\mathcal V}|) \times {\mathcal V}\) being a set of relations between the cells in \({\mathcal V}\). Every relation between cells is represented by an arena edge of form \((\textsf {c}_a, i, \textsf {c}_b)\), also written \(\textsf {c}_a \smash {\xrightarrow {i}_{}} \textsf {c}_b\), with no duplicates, i.e., for every pair \((\textsf {c}_{a_1}, i_1, \textsf {c}_{b_1}), (\textsf {c}_{a_2}, i_2, \textsf {c}_{b_2}) \in {\mathcal E}\) we have that \(\textsf {c}_{a_1} = \textsf {c}_{a_2} \wedge \textsf {c}_{b_1} \ne \textsf {c}_{b_2}\) implies \(i_1 \ne i_2\), and no gaps in the relation indexes, i.e., for every edge \((\textsf {c}_a, i, \textsf {c}_b) \in {\mathcal E}\) and index \(j \in 1..(i-1)\) we have that \(\exists ~ \textsf {c}_c \in {\mathcal V}~.~ (\textsf {c}_a, j, \textsf {c}_c) \in {\mathcal E}\). A binding is a partial function from KerA\(^+\) variables to \({\mathcal V}\) of \({\mathcal A}\), i.e., a mapping from variables to cells. Finally, \({\mathrm \Phi }\) is a formula in the SMT fragment supported by the ARS and the target SMT solver, e.g., the quantifier-free fragment of uninterpreted functions and non-linear integer arithmetic (QF_UFNIA) supported by the constants encoding.

A series of n reduction steps has the form \({ s _{0}} {\rightarrowtail } { s _{1}} {\rightarrowtail } \dots {\rightarrowtail } { s _{n}}\), with each step generating state \({ s _{i+1}}\) from state \({ s _{i}}\), \(0 \le i < n\), by applying a reduction rule. The initial state \({ s _{0}} = (e_0, {\mathcal A}_0, \nu _0, {\mathrm \Phi }_0)\) has \(e_0\) as the initial KerA\(^+\) specification, \({\mathcal A}_0 = (\emptyset , \emptyset )\), \(\nu _0\) containing no mappings, and \({\mathrm \Phi }_0 = \textsf{true}\). The reduction steps end upon reaching a state \({ s _{n}} = (e_n, {\mathcal A}_n, \nu _n, {\mathrm \Phi }_n)\), with \(e_n\) being a single cell \(\textsf {c}\in {\mathcal V}_n\) and \({\mathcal A}_n = ({\mathcal V}_n, {\mathcal E}_n)\). Below we give two examples of rules.

Integer literal reduction. One of the simplest rules rewrites an integer literal num into a cell \(\textsf {c}_{ num }\). This cell is added to the arena and a constraint equating \(\textsf {c}_{ num }\) to the literal is conjoined with \({\mathrm \Phi }\); we use vertical lines to separate state elements and commas to indicate additions to \({\mathcal A}\) and conjunctions to \({\mathrm \Phi }\).

[Figure: rule Int, given as an inference.]

The descriptions of rules can be given as inferences, with the premisses above the bar and the resulting state below it. Inferences, although reasonable for expressing rules such as Int, are not well suited to conveying the intuition behind more complex rules. In light of this, we will use a simplified notation moving forward. We inline inferences as \({\rightarrowtail }\) and omit nonessential information, e.g., propagated values. Below we can see rule Int in this simplified format. Note that only \({\mathcal A}\) and \({\mathrm \Phi }\) updates are shown, without propagating them, and that \(\nu \) is omitted.

[Figure: rule Int in the simplified inline notation.]
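In SMT-LIB terms, reducing the literal 42 would amount to constraints of the following shape; the cell name is ours and the snippet is only a sketch of the generated formula.

```smt2
(declare-const c_42 Int)   ; the fresh cell introduced for the literal 42
(assert (= c_42 42))       ; the constraint conjoined with Phi
(check-sat)                ; sat, with c_42 bound to 42
```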

Picking. To pick a cell out of n cells we use an oracle \(\theta \), as per rule FromBasic. In addition to the \(\textsf{FROM}~ ... ~\textsf{BY}~ \theta \) expression, this rule requires that all pickable cells are of the same basic type \(\tau \), e.g., \(\textsf{Int}\). The resulting state has a new cell \(\textsf {c}_{ pick }\), which is equated to one of the n cells if \(1 \le \theta \le n\) and is unconstrained otherwise. Picking among cells representing data structures, e.g., sets, can be done via a more general version of rule FromBasic, which we omit for brevity.

[Figure: rule FromBasic.]
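To make the rule concrete, the sketch below shows the kind of constraints FromBasic could produce when picking among three integer cells; the cell and oracle names are hypothetical, and only the shape of the implications is taken from the rule.

```smt2
(declare-const theta Int)      ; the oracle
(declare-const c_1 Int)        ; pickable cells of the basic type Int
(declare-const c_2 Int)
(declare-const c_3 Int)
(declare-const c_pick Int)     ; the fresh cell introduced by FromBasic
; c_pick is equated to the cell chosen by the oracle and is unconstrained otherwise
(assert (=> (= theta 1) (= c_pick c_1)))
(assert (=> (= theta 2) (= c_pick c_2)))
(assert (=> (= theta 3) (= c_pick c_3)))
(check-sat)
```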

2.4 SMT Theory of Arrays

The theory of arrays provides a natural way to encode data structures and is thus a prime candidate as an encoding target for TLA\(^+\) constructs. Here we present the theory’s operators relevant for our work; further details can be found in [8].

Given the set of sorts \({\textsf{S}}\), containing one sort \({\textsf{s}_{\tau }}\) for each type \(\tau \) in KerA\(^+\), an array sort \({\textsf{s}_{\tau _1, \tau _2}}\) has the form \({\textsf{s}_{\tau _1}} \Rightarrow {\textsf{s}_{\tau _2}}\), with \({\textsf{s}_{\tau _1}} \in {\textsf{S}}\) being its index sort and \({\textsf{s}_{\tau _2}} \in {\textsf{S}}\) being its value sort. Each array sort is supported by two basic operators, \( select : ({\textsf{s}_{\tau _1}} \Rightarrow {\textsf{s}_{\tau _2}}, {\textsf{s}_{\tau _1}}) \rightarrow {\textsf{s}_{\tau _2}}\), which handles array access at a given index, and \( store : ({\textsf{s}_{\tau _1}} \Rightarrow {\textsf{s}_{\tau _2}}, {\textsf{s}_{\tau _1}}, {\textsf{s}_{\tau _2}}) \rightarrow {\textsf{s}_{\tau _1}} \Rightarrow {\textsf{s}_{\tau _2}}\), which updates an array for a given index and value. For brevity, we will write \( select (a,i)\) as a[i] in the remainder of the manuscript. Regarding equality between arrays, different interpretations are possible. We use arrays with extensionality [25], which are considered equal if they contain the same values in the same entries. Extensionality is formally defined as \(\forall ~ a,b: {\textsf{s}_{\tau _1}} \Rightarrow {\textsf{s}_{\tau _2}} ~.~ a = b \vee \exists ~ i: {\textsf{s}_{\tau _1}} ~.~ a[i] \ne b[i]\). For access and update, consistency is ensured by the following property:

$$\begin{aligned} \begin{array}{c}\forall ~ a: {\textsf{s}_{\tau _1}} \Rightarrow {\textsf{s}_{\tau _2}}, ~ i: {\textsf{s}_{\tau _1}}, ~ j: {\textsf{s}_{\tau _1}}, ~ v: {\textsf{s}_{\tau _2}} ~ . ~ \\ \underbrace{ store (a,i,v)[i] = v}_{\text {access consistency}} \wedge \underbrace{(i = j \vee store (a,i,v)[j] = a[j])}_{\text {update consistency}} \end{array}\end{aligned}$$

In addition to select and store, the theory of arrays can be extended with other operators, two of which are map\(_f\) and K\(_{{\textsf{s}_{\tau }}}\), whose signatures are shown below. The map\(_f\) operator applies an n-ary function \(f: ({\textsf{s}_{\tau _1}}, ..., {\textsf{s}_{\tau _n}}) \rightarrow {\textsf{s}_{\tau _f}}\) to the values stored at each index of its array arguments, producing a new array whose values are the result of the function application, i.e., map\(_f\) is the pointwise array extension of f. The K\(_{{\textsf{s}_{\tau }}}\) operator produces a constant array, with all its values being the constant provided as argument. The properties defining the behaviour of these two operators are shown after their signatures.

$$\begin{aligned} map_f: ({\textsf{s}_{\tau }} \Rightarrow {\textsf{s}_{\tau _1}}, \dots , {\textsf{s}_{\tau }} \Rightarrow {\textsf{s}_{\tau _n}}) \rightarrow {\textsf{s}_{\tau }} \Rightarrow {\textsf{s}_{\tau _f}} \qquad \qquad K_{{\textsf{s}_{\tau }}}: {\textsf{s}_{\tau _{ const }}} \rightarrow {\textsf{s}_{\tau }} \Rightarrow {\textsf{s}_{\tau _{ const }}} \end{aligned}$$
$$\begin{aligned} \forall ~ a_1: {\textsf{s}_{\tau }} \Rightarrow {\textsf{s}_{\tau _1}}, ~ ..., ~ a_n: {\textsf{s}_{\tau }} \Rightarrow {\textsf{s}_{\tau _n}}, ~ i: {\textsf{s}_{\tau }} ~ . ~ map_f(a_1, ..., a_n)[i] = f(a_1[i], ..., a_n[i]) \end{aligned}$$
$$\begin{aligned} \forall ~ i: {\textsf{s}_{\tau _1}}, ~ v: {\textsf{s}_{\tau _2}} ~ . ~ K_{{\textsf{s}_{\tau _1}}}(v)[i] = v \end{aligned}$$

The select and store operators are part of the theory of arrays with extensionality defined in version 2.6 of the SMT-LIB standard [3]. Other operators are provided on a solver-by-solver basis, e.g., Z3 [7] supports both map\(_f\) and \(K_{{\textsf{s}_{\tau }}}\), while CVC5 [2] supports \(K_{{\textsf{s}_{\tau }}}\); SMT-LIB updates may add them to the standard.
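The snippet below exercises these operators on a small integer-indexed Boolean array; it is a minimal sketch in Z3’s SMT-LIB dialect, since the constant-array and map combinators are Z3 extensions, and the names a, b, and i are ours.

```smt2
(declare-const a (Array Int Bool))
(declare-const b (Array Int Bool))
(declare-const i Int)
; K_Bool(false) followed by a store: an array that is true only at index 3
(assert (= a (store ((as const (Array Int Bool)) false) 3 true)))
(assert (select a 3))          ; access consistency: reading back the stored value
(assert (not (select a i)))    ; forces i to differ from 3 (update consistency)
; map_and: pointwise conjunction; here b agrees with a at every index
(assert (= b ((_ map and) a a)))
(assert (= a b))               ; extensionality makes this equality satisfiable
(check-sat)                    ; expected: sat
```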

3 Encoding TLA+ using Arrays

Our goal is to encode TLA\(^+\) data structures in a structure-preserving way. To do this, we use arrays to represent the main components of TLA\(^+\), sets and functions, as SMT constraints. We follow the ARS structure described in Section 2.3, but update the reduction rules handling sets and functions. The remaining TLA\(^+\) constructs, e.g., tuples, are represented as per the constants encoding.

The two efficiency benefits of the arrays encoding are the ease of access to data structures and the possibility of using SMT equality. The first benefit can be easily understood via SMT select, which allows us to check a stored value with a single constraint, in contrast to the number of constraints used in the constants encoding, which is linear in the size of the data structures’ overapproximation. The second benefit affects the comparison of data structures, which can be done via a single SMT equality for sets and functions in the arrays encoding, since these structures are represented by a single SMT term, while the constants encoding requires a number of constraints that is quadratic in the size of the data structures’ overapproximation. A summary can be seen in Table 1. We first describe how to encode sets and functions, and then present the correctness argument for the reduction to arrays.

3.1 Encoding TLA+ Sets using Arrays

We use arrays to encode TLA\(^+\) sets as characteristic functions, i.e., a set of type \(\tau \) is represented by an array of sort \({\textsf{s}_{\tau }} \Rightarrow \text {Bool}\). Set membership is encoded by storing \(\textsf{true}\) or \(\textsf{false}\) on a given array index. The reduction rules used to handle the main set operators are presented below.

Set Enumeration.

The simplest way to create a set is to enumerate its elements. Rule Enum reduces an explicit set of cells to a fresh cell \(\textsf {c}_{ set }\), whose edges link it to its elements; \(\textsf {c}_{ set } \smash {\rightarrow _{}} \textsf {c}_1, \dots , \textsf {c}_n\) is a shorthand for \(\textsf {c}_{ set } \smash {\xrightarrow {1}_{}} \textsf {c}_1, ..., \textsf {c}_{ set } \smash {\xrightarrow {n}_{}} \textsf {c}_n\). There is no guarantee that the enumerated elements are unique, thus the arena may contain edges to repeated elements.

[Figure: rule Enum.]

The constraints EnumCtr added by the arrays encoding create an empty set, by using a constant array with the value \(\textsf{false}\), \(\bot \), and then update the array by storing \(\textsf{true}\), \(\top \), at the appropriate indexes. The array resulting from the last update, \(a^n_{\textsf {c}_{ set }}\), is then equated to \(\textsf {c}_{ set }\). Since cells representing repeated elements lead to updates of the same index, we encode standard sets, in contrast to the constants encoding, which encodes multisets due to the arena imprecision; multisets lead to multiple constraints being generated to encode membership of a single element.

[Figure: the EnumCtr constraints.]
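As a concrete instance of EnumCtr, and assuming Z3’s constant-array extension, the TLA\(^+\) set \(\{5, 6\}\) could be encoded as follows; the cell names are illustrative.

```smt2
; characteristic-function view of the TLA+ set {5, 6}
(declare-const c_5 Int)
(declare-const c_6 Int)
(declare-const c_set (Array Int Bool))
(assert (= c_5 5))
(assert (= c_6 6))
; start from the empty set K_Bool(false) and store true at the element indexes
(assert (= c_set (store (store ((as const (Array Int Bool)) false) c_5 true) c_6 true)))
(check-sat)   ; sat, with c_set interpreted as the characteristic function of {5, 6}
```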

Although the number of constraints generated by the arrays encoding to model set enumeration is equal to that of the constants encoding, it has the benefit of producing a defined interpretation for \(\textsf {c}_{ set }\), the array \(a^n_{\textsf {c}_{ set }}\), which is not present in the constants encoding. This has a significant impact on set membership and cell equality, as described below.

Table 1. Number of constraints generated by each SMT encoding to model the main TLA\(^+\) constructs.

Set Membership. The checking of a membership relation \(\textsf {c}_x \in \textsf {c}_{ set }\), given the presence of the arena edges \(\textsf {c}_{ set } \smash {\rightarrow _{}} \textsf {c}_1, ..., \textsf {c}_n\) and \(1 \le x \le n\), is straightforward. A single fresh cell of Boolean type is introduced and is equated to \(\textsf {c}_{ set }[\textsf {c}_x]\).

Cell Equality. The constraints generated when encoding set membership and many other constructs assume that cells can be compared. When this is not directly the case, the required equalities are cached beforehand. For example, if a set of n tuples \(\textsf {c}_t\) of size two is being equated, the constraints \(\textsf {c}_{t_i} = \textsf {c}_{t_j} \leftrightarrow \textsf {c}^1_{t_i} = \textsf {c}^1_{t_j} \wedge \textsf {c}^2_{t_i} = \textsf {c}^2_{t_j}\), with \(1 \le i \le n\) and \(1 \le j \le n\), are added to \({\mathrm \Phi }\); here we use \(\textsf {c}^1_t\) and \(\textsf {c}^2_t\) to represent the values of the 2-tuple. The need for this caching of equalities only arises when data structures encoded as uninterpreted constants are compared. For the remaining rules we assume that caching was done, if needed, and cells can be compared via direct equality.
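Both the membership check above and the comparison of set cells reduce to single constraints in the arrays encoding, as the following sketch shows for hypothetical cells of sort \(\textsf{Int}\Rightarrow \text {Bool}\); the constants encoding would instead need a linear, respectively quadratic, number of constraints here.

```smt2
(declare-const c_S (Array Int Bool))
(declare-const c_T (Array Int Bool))
(declare-const c_x Int)
(declare-const c_mem Bool)
(assert (= c_mem (select c_S c_x)))   ; set membership: a single select per check
(assert (= c_S c_T))                  ; set equality: one SMT equality, extensionality does the rest
(check-sat)
```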

Set Filter. In TLA\(^+\), the elements of a set S can be filtered by a predicate p via the expression \(\{x \in S: p\}\). This expression creates a set F which contains only the elements of S that satisfy p, e.g., \(\{x \in \{-1, 0, 1\}: x \ge 0\} = \{0, 1\}\). Rule Filter reduces a filter to a new set cell, \(\textsf {c}_F\), whose arena overapproximation contains the elements of S, but whose constraints ensure that only filtered elements are members of F; p[y/x] means that x is replaced by y in p and parentheses indicate the application of another rule, the predicate resolution rule in this case.

[Figure: rule Filter.]

The constraints added use an array \(a^0_{\textsf {c}_F}\) that is initially unconstrained, i.e., the values mapped by all the indexes of \(a^0_{\textsf {c}_F}\) are unconstrained, as opposed to \(a^0_{\textsf {c}_{set}}\) in EnumCtr. The values of \(a^0_{\textsf {c}_F}\) mapped by indexes \(\textsf {c}_1, \dots , \textsf {c}_n\) are constrained by \(\textsf {c}^p_1, \dots , \textsf {c}^p_n\) via array access, i.e., \(a^0_{\textsf {c}_F}[\textsf {c}_i]\) is asserted to be \(\textsf{true}\) or \(\textsf{false}\) based on \(\textsf {c}^p_i\), with \(1 \le i \le n\). We then apply pointwise conjunction to \(\textsf {c}_S\) and \(a^0_{\textsf {c}_F}\) via the map\(_f\) SMT operator; we go from \(a^0_{\textsf {c}_F}\) to \(a^n_{\textsf {c}_F}\) to keep the array index in step with the arena overapproximation. Indexes whose values were \(\textsf{false}\) in S remain so in F, and indexes whose values were \(\textsf{true}\) in S store the filter’s predicate evaluation.

[Figure: the FilterCtr constraints.]

Both encodings generate a linear number of constraints, since n predicates \(p[\textsf {c}_i/x]\) have to be considered. Unlike EnumCtr, FilterCtr does not contain a chain of store operations, due to the usage of map\(_f\). This avoids the need to create intermediary arrays, and is not possible in EnumCtr due to its constant array.
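Assuming Z3’s map extension, the filter \(\{x \in \{-1, 0, 1\}: x \ge 0\}\) could give rise to constraints of roughly the following shape; the auxiliary array and the inlined predicate evaluations are named and written for illustration only.

```smt2
(declare-const c_S  (Array Int Bool))   ; the filtered set S, assumed constrained elsewhere
(declare-const a0_F (Array Int Bool))   ; the initially unconstrained auxiliary array
(declare-const c_F  (Array Int Bool))   ; the resulting filter cell
; predicate evaluations p[c_i/x] for the overapproximation cells -1, 0, 1
(assert (= (select a0_F (- 1)) (>= (- 1) 0)))
(assert (= (select a0_F 0) (>= 0 0)))
(assert (= (select a0_F 1) (>= 1 0)))
; pointwise conjunction of S's characteristic function with the predicate evaluations
(assert (= c_F ((_ map and) c_S a0_F)))
(check-sat)
```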

Set Map. The expression \(\{e: x \in S\}\) can be used to construct a set M from a set S, with the elements of M being e[y/x], for \(y \in S\). For example, the expression \(\{x \div 5: x \in \{4,5,6\}\}\) yields the set \(\{0,1\}\), with \(\div \) denoting standard integer division. To reduce set map we use rule Map.

[Figure: rule Map.]

The constraints added in rule Map are similar to those added in rule Enum. The difference between them is that set enumeration precisely defines the elements to be added to the new set cell, while set map is based on an existing set cell, which is a set overapproximation. Due to this, membership in M has to be guarded by membership in S, leading to a linear number of constraints being generated.

[Figure: the constraints added by rule Map.]
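One plausible way to realise the membership guard, under the assumption that each element contributes a conditional store, is sketched below for \(\{x \div 5: x \in \{4,5,6\}\}\); the intermediate array names are hypothetical.

```smt2
(declare-const c_S (Array Int Bool))   ; overapproximation of {4, 5, 6}, constrained elsewhere
(declare-const a_0 (Array Int Bool))   ; the empty result set
(declare-const a_1 (Array Int Bool))
(declare-const a_2 (Array Int Bool))
(declare-const a_3 (Array Int Bool))
(declare-const c_M (Array Int Bool))   ; the mapped set
(assert (= a_0 ((as const (Array Int Bool)) false)))
; each element of S contributes e[c_i/x] to M only if it actually belongs to S
(assert (= a_1 (ite (select c_S 4) (store a_0 (div 4 5) true) a_0)))
(assert (= a_2 (ite (select c_S 5) (store a_1 (div 5 5) true) a_1)))
(assert (= a_3 (ite (select c_S 6) (store a_2 (div 6 5) true) a_2)))
(assert (= c_M a_3))
(check-sat)
```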

3.2 Encoding TLA+ Functions using Arrays

We use arrays to encode TLA\(^+\) functions directly as functions themselves. To do this, arrays are used in their general form, with a function \(f: {\textsf{s}_{\tau _1}} \rightarrow {\textsf{s}_{\tau _2}}\) being encoded as an array of sort \({\textsf{s}_{\tau _1}} \Rightarrow {\textsf{s}_{\tau _2}}\). Since functions with a finite domain may range over infinite sorts, e.g., the integers, the encoding of each function also includes constraints defining its domain set, by means of the rules described in the previous section; the result of applying a function to a value outside its domain is undefined in TLA\(^+\). This approach allows us to generate SMT constraints that follow directly from TLA\(^+\), making the encoding not only more efficient, but also more natural to describe. In contrast, the constants encoding represents functions explicitly as sets of pairs of the form \(\{\langle x,f(x)\rangle : x \in \textsc {domain}f\}\). Due to this, its function manipulation relies on set manipulation, e.g., function comparison is encoded as set comparison, leading to a quadratic number of constraints. The reduction rules used to handle functions are presented below.

Function Definition. The definition of a function in TLA\(^+\) is an expression of the form \([x \in S \mapsto e]\), which maps every domain value v to the expression e[v/x]. This definition is similar to that of set map \(\{e: x \in S\}\), and thus generates constraints in a similar fashion to rule Map. The main difference is that the evaluations of the expression e[v/x] are stored as array values, rather than array indexes, i.e., function definition uses \( store (a, v, e[v/x])\) and set map uses \( store (a, e[v/x], \top )\), with v being a value in the function’s domain or the set being mapped. Every encoded function has a single argument, with multiple arguments being rewritten as tuples in preprocessing.

Unlike with set cells, a function cell \(\textsf {c}_F\) in the arena does not directly point to its values, with the arrays encoding adding two edges to \(\textsf {c}_F\), \(\textsf {c}_F \smash {\xrightarrow {1}_{}} \textsf {c}_{F_{ dom }}\) and \(\textsf {c}_F \smash {\xrightarrow {2}_{}} \textsf {c}_{F_{ pairs }}\). Cell \(\textsf {c}_{F_{ dom }}\) represents the function’s domain and cell \(\textsf {c}_{F_{ pairs }}\) represents the set of pairs \(\{\langle x,f(x)\rangle : x \in \textsc {domain}f\}\). Cell \(\textsf {c}_{F_{ pairs }}\), despite being in the arena, has no SMT constraints modelling it in the arrays encoding, with its sole purpose being to help propagate the arena edges of the function’s codomain elements.
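Putting the two paragraphs above together, a hedged sketch of the constraints for the definition \([x \in \{1, 2\} \mapsto x + 1]\) is given below: the function cell is an integer-to-integer array whose values are the evaluations of \(x + 1\), and its domain cell is the set \(\{1, 2\}\) encoded as in Section 3.1; all names are illustrative.

```smt2
(declare-const c_f     (Array Int Int))    ; the function cell for [x \in {1, 2} |-> x + 1]
(declare-const c_f_dom (Array Int Bool))   ; its domain cell, the set {1, 2}
(declare-const a_0     (Array Int Int))    ; base array; values outside the domain are irrelevant
; function values are stored as array values, in contrast to set map
(assert (= c_f (store (store a_0 1 (+ 1 1)) 2 (+ 2 1))))
; the domain set is built exactly as in the set encoding
(assert (= c_f_dom (store (store ((as const (Array Int Bool)) false) 1 true) 2 true)))
(check-sat)
```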

Function Domain. Accessing a function’s domain is trivial in the arrays encoding, since the domain set is generated during function definition. This results in a simple access to the array representing the domain.

Function Update. The update of a TLA\(^+\) function f is done by changing the result of applying f to an argument arg, f[arg], to be a given value v, via the expression \([f \text { EXCEPT } ![arg] = v]\). The update produces a new function g which is identical to f, except that \(g[arg] = v\) if \(arg \in \textsc {domain}f\). The arrays encoding generates a single array update constraint in this case.

Function Application. The application of a function to an argument arg is conceptually simple, but is quite intricate to realise, as can be seen in rule FunApp. The arrays encoding uses an oracle to check that \(\textsf {c}_{ arg }\) is in the domain and to gather the arena edges of \(\textsf {c}_{ res }\). The FunAppCtr constraints ensure that the oracle chooses the correct index and equate the result cell to an array access on \(\textsf {c}_F\). Note that the value of \(\textsf {c}_{ res }\) comes directly from the function application expression itself, with the oracle only being needed to gather the arena edges of \(\textsf {c}_{ res }\), if \(m > 0\), via \(\textsf {c}^p\). The need for an oracle is restricted to functions whose codomain contains structured data, e.g., \(f: \textsf{Int}\rightarrow \textsf{Set}[\textsf{Int}]\). If this is not the case, e.g., \(g: \textsf{Int}\rightarrow \textsf{Int}\), rule FunApp is simplified and FunAppCtr becomes \(\textsf {c}_{ res } = \textsf {c}_F[\textsf {c}_{ arg }]\).

[Figure: rule FunApp.]
[Figure: the FunAppCtr constraints.]
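For functions whose codomain holds only unstructured data, update and application thus boil down to single constraints; the sketch below, with hypothetical cells, illustrates this simplified case.

```smt2
(declare-const c_f   (Array Int Int))   ; a function cell, assumed constrained elsewhere
(declare-const c_g   (Array Int Int))   ; the update [c_f EXCEPT ![c_arg] = c_v]
(declare-const c_arg Int)
(declare-const c_v   Int)
(declare-const c_res Int)
(assert (= c_g (store c_f c_arg c_v)))  ; function update: a single array update constraint
(assert (= c_res (select c_f c_arg)))   ; function application in the simplified case
(check-sat)
```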

3.3 Correctness of the Reduction to Arrays

Correctness of the ARS is given by four properties: finiteness of the models, compliance to the target SMT theories, termination of any reduction sequence, and soundness of the reductions. These properties have their correctness sketched for the constants encoding in [13], with detailed proofs present in [26]. Since we rely on the existing ARS and restrict our changes to mainly affect constraint generation, we have the same degree of overapproximation and the correctness arguments made for the constants encoding are in large part valid for the arrays encoding. We present below the definition of a KerA\(^+\) model and detail, for each property, how the use of arrays affects the correctness arguments and how they can be adjusted to remain valid.

Models. Every satisfiable KerA\(^+\) formula has a model \(\mathcal {M}= \langle \mathcal {D}, \mathcal {I}\rangle \), where \(\mathcal {D}\) is the model domain, consisting of a disjoint union of sets \(\mathcal {D}_1, ..., \mathcal {D}_n\), with \(\mathcal {D}_i\), \(1 \le i \le n\), containing the values for type \(\tau _i\), and \(\mathcal {I}\) is the model interpretation, consisting of assignments of domain values to KerA\(^+\) constants. Models are used to access cell values, with the value of a KerA\(^+\) expression e in model \(\mathcal {M}\) being \(\llbracket {e} \rrbracket ^{\mathcal {M}_{}}\). In a transition \({ s _{ before }} {\rightarrowtail } { s _{ after }}\), we go from \(\mathcal {M}_{ before }\) to \(\mathcal {M}_{ after }\), with \(\mathcal {M}_{ after }\) containing the interpretation of additional constants and thus being an extension of \(\mathcal {M}_{ before }\).

Finiteness. This property states that every interpretation of a KerA\(^+\) expression is defined only over finite values. Its proof is derived from the finiteness of the elements being modelled. In the arrays encoding, we potentially use arrays with infinite sorts, e.g., the integers, but all SMT interpretations that can be derived from such arrays are finite, since we encode only finite TLA\(^+\) data structures. This guarantees finiteness of all KerA\(^+\) models in the arrays encoding.

Theory Compliance. This property states that, in any sequence of states, the formulas \({\mathrm \Phi }_i\), \(1 \le i \le n\), are in the first-order logic fragment containing only quantifier-free expressions over uninterpreted functions and integer arithmetic. Its proof is done by induction on the constraints generated. The constraint \({\mathrm \Phi }_0\) is always \(\textsf{true}\) and is thus trivially compliant. The inductive case is proved by showing that the constraints added by each rule are compliant. The rules in the arrays encoding only add array constraints, in addition to constraints supported by the constants encoding, so theory compliance is straightforward to guarantee.

Termination. This property states that every sequence of ARS reductions is finite, i.e., the reduction process always terminates. Its proof is based on ensuring that every rule r applied to a given state \({ s _{ before }}\) yields a state \({ s _{ after }}\) with \(e_{ after }\) being smaller than \(e_{ before }\). An expression’s length is given based on the length of its sub-expressions. The arrays encoding mainly changes constraint generation, and in the cases where rules are slightly modified they generate resulting expressions of the same size, thus guaranteeing termination.

Soundness. This property is described in Theorem 1. Both e and \({\mathrm \Phi }\) are KerA\(^+\) expressions, but \({\mathrm \Phi }\) is in the first-order logic fragment supported by SMT solvers. Fundamentally, the ARS is rewriting a formula to forward it to the solver. The soundness proof consists of a case analysis of each reduction rule to establish that \(e_{ before } \wedge {\mathrm \Phi }_{ before }\) is equisatisfiable to \(e_{ after } \wedge {\mathrm \Phi }_{ after }\), no matter the rule applied in \({ s _{ before }} {\rightarrowtail } { s _{ after }}\). The case analysis, which describes how \(e_{ after }\) and \({\mathrm \Phi }_{ after }\) can be derived from \(e_{ before }\) and \({\mathrm \Phi }_{ before }\) for each rule, relies on six invariants of the reduction system. Three invariants, 1, 3, and 4, are encoding independent, and thus are the same as in [13]; the remaining three, 2, 5, and 6, are changed due to the new representation of sets and functions. Below we show all six invariants, with the modifications needed to guarantee soundness for the arrays encoding.

Theorem 1

Let \({ s _{0}} {\rightarrowtail } { s _{1}} {\rightarrowtail } \dots {\rightarrowtail } { s _{n}}\) be a sequence of states produced by the ARS, with \({ s _{i}} = \big <{e_i}\mid {{\mathcal A}_i}\mid {\nu _i}\mid {{\mathrm \Phi }_i}\big>\) and \(1 \le i \le n\). Assume that \(e_0\) is a formula, i.e., it has type \(\textsf{Bool}\). Then \(e_0\) is satisfiable iff the conjunction \(e_n \wedge {\mathrm \Phi }_n\) is satisfiable.

Invariant 1

(type correctness) In every reachable state \(\big <{e}\mid {{\mathcal A}}\mid {\nu }\mid {{\mathrm \Phi }}\big>\) of the ARS, the expression e is well typed.

Invariant 2

(arena membership) In every reachable state \(\big <{e}\mid {{\mathcal A}}\mid {\nu }\mid {{\mathrm \Phi }}\big>\) of the ARS, every cell \(\textsf {c}\) in either the expression e or the formula \({\mathrm \Phi }\) is also in \({\mathcal A}\).

Invariant 3

(model suitability) Let \({ s _{ before }} {\rightarrowtail } { s _{ after }}\) be a reachable transition in the ARS, and \(\mathcal {M}_{ before }\) be a suitable model for \({ s _{ before }}\). Then an extension \(\mathcal {M}_{ after }\) of \(\mathcal {M}_{ before }\) is suitable for \({ s _{ after }}\).

Invariant 4

(overapproximation) Let \(\big <{e}\mid {{\mathcal A}}\mid {\nu }\mid {{\mathrm \Phi }}\big>\) be a reachable state of the ARS, and \(\mathcal {M}\) be its model. Assume that \(\textsf {c}_{ set }\) is a set cell in the arena \({\mathcal A}\) and that \(\textsf {c}_{ set } \smash {\rightarrow _{}} \textsf {c}_1, \dots , \textsf {c}_n\) are edges in \({\mathcal A}\), for some \(n \ge 0\). Then, it holds that \(\llbracket {\textsf {c}_{ set }} \rrbracket ^{\mathcal {M}_{}} \subseteq \{\llbracket {\textsf {c}_{1}} \rrbracket ^{\mathcal {M}_{}}, ..., \llbracket {\textsf {c}_{n}} \rrbracket ^{\mathcal {M}_{}}\}\).

Invariant 5

(function domain) Let \(\big <{e}\mid {{\mathcal A}}\mid {\nu }\mid {{\mathrm \Phi }}\big>\) be a reachable state of the ARS. Assume that \(\textsf {c}_f\) is a function cell of type \({\textsf{s}_{\tau _1}} \rightarrow {\textsf{s}_{\tau _2}}\) in the arena \({\mathcal A}\). Then, there is a cell \(\textsf {c}_{ dom }\) of type \({\textsf{s}_{\textsf{Set}[\tau _1]}}\) such that \(\textsf {c}_f \smash {\xrightarrow {1}_{{\mathcal A}}} \textsf {c}_{ dom }\).

Invariant 6

(domain reduction) Let \(\big <{e}\mid {{\mathcal A}}\mid {\nu }\mid {{\mathrm \Phi }}\big>\) be a reachable state of the ARS, and \(\mathcal {M}\) be its model. Assume that \(\textsf {c}_f\) is a function cell and that \(\textsf {c}_f \smash {\xrightarrow {1}_{}} \textsf {c}_{F_{ dom }}\) is in the arena \({\mathcal A}\). Then, it follows that \(\llbracket {\textsf {c}_{F_{ dom }}} \rrbracket ^{\mathcal {M}_{}} = \llbracket {\textsc {domain}f} \rrbracket ^{\mathcal {M}_{}}\).

As described in Sections 3.1 and 3.2, arrays precisely model TLA\(^+\) sets and functions. The handling of sets revolves around membership constraints of the form \(\textsf {c}_{ set }[\textsf {c}_i]\), which can be set to \(\textsf{true}\) or \(\textsf{false}\) via store. Regarding functions, function application and update are trivially equivalent to array access and update. The more elaborate array operators also have a counterpart in TLA\(^+\). Constant arrays are equivalent to a function definition for which all range values are the same constant, and array map is equivalent to set map. These equivalences explain why the changes in the arrays encoding do not invalidate the case analysis of the reduction rules used to prove Theorem 1, thus guaranteeing soundness.

4 Evaluation

In order to evaluate the performance impact of the arrays-based encoding, we implemented it in the Apalache model checker, which currently supports the constants encoding. Given a TLA\(^+\) specification containing a property P, Apalache is capable of performing bounded model checking up to a length k and, if P is an inductive invariant, it can check whether the property holds for unbounded lengths. In both modes, Apalache checks if the SMT formula encoding the specification is satisfiable when conjoined with \(\lnot P\), and if that is the case a counterexample (CEX) in the form of a trace is produced using the arena information and the satisfying assignment provided by the SMT solver. Our implementation adds new reduction rules to Apalache, which can be enabled via a CLI flag. When enabled, these rules replace the existing ones encoding sets and functions, as described in Section 3. In addition, we also extended Apalache’s CEX generation to handle assignments to SMT formulas containing arrays. We use Z3 [7] as our back-end solver. Apalache is open-source and freely available.

We performed a number of experiments using Apalache and the explicit-state model checker TLC. For Apalache, we evaluated both its existing constants encoding and two versions of the arrays encoding we propose, called arrays and funArrays. The arrays version encodes both TLA\(^+\) sets and functions as arrays, while the funArrays version encodes only TLA\(^+\) functions as arrays. The purpose of having two versions of our encoding is to evaluate the impact of encoding sets and functions as arrays separately. Our evaluation setup consisted of a machine with 64 AMD EPYC 7452 processors and 256 GB of memory. We first present the benchmarks used and then discuss the results obtained.

4.1 Benchmarks

We consider the TLA\(^+\) specifications of three asynchronous protocols as benchmarks. The first benchmark is a specification of the asynchronous Byzantine agreement protocol by Bracha and Toueg [5], shown in a simplified version in Figure 2, to which we refer as aba. The second benchmark is a specification of the consensus algorithm with Byzantine faults in one communication step by Dobre and Suri [9], to which we refer as cab. The third benchmark is a specification of the asynchronous non-blocking atomic commitment protocol by Guerraoui [12], to which we refer as nac. The common use of aba and cab is in replication scenarios with \(N=3F+1\) replica nodes to tolerate F failures, while the nac protocol is typically used for partitioned databases. The specifications are available online.

4.2 Results

For each specification we check a variation of the agreement property. The results are shown in Figure 4. We can see that both arrays and funArrays scale better than the constants encoding, with an order of magnitude improvement for some instances. It is also worth pointing out that arrays and funArrays were able to reach a result before the time limit in 29 and 28 instances, respectively, while the constants encoding was able to do so in only 20 instances. Regarding TLC, it performed worse than the three Apalache encodings in the nontrivial cases, only reaching a result before the time limit in 8 instances.

Fig. 4.

Time taken to check agreement for aba, cab, and nac. Specifications were run in two configurations, one in which agreement is expected to hold (OK) and one in which it is not (NotOK). Instance size stands for the number of nodes used, and the time is given in seconds on a logarithmic scale; the timeout (TO) is 1 hour.

5 Related Work

An extensive discussion of works related to symbolic model checking for TLA\(^+\) can be found in [13]. Here we focus exclusively on closely related publications. The IVy Prover [20] was designed to tackle the verification of distributed algorithms with a decidable fragment of relational first-order logic. Some distributed algorithms, such as the one in Figure 2, cannot be directly expressed in this fragment, however, due to the use of power sets and set cardinalities. Recent efforts have focused on offering support to reason about set cardinalities [4], but limitations remain. Cut-off based techniques to automatically infer invariants of distributed algorithms in the IVy language, such as relational abstractions of Paxos and two-phase commit, have been recently proposed [10, 11]. Similar benchmarks are used in [22] to infer generalised invariants from finite instances of TLA\(^+\) and to semi-automatically prove invariants with TLAPS. Specifications of fault-tolerant distributed algorithms encoded as threshold automata can be efficiently verified with ByMC [15, 24]. The manual rewriting of an algorithm into threshold automata is, however, usually beyond the skills of a typical TLA\(^+\) user. The work closest to ours involves the use of SMT arrays to encode Event-B and TLA\(^+\) specifications in ProB [21]. The ProB approach aims at handling infinite data structures, in contrast to our choice to work with bounded overapproximations. Reasoning about infinite domains implies the use of quantifiers, which prevents the use of the efficient decision procedures available for the decidable fragments of SMT; this approach has been shown to underperform when compared against Apalache on the benchmarks from [13]. An important last point to mention is that CVC5 has its own non-standard SMT theory of sets [1]. This theory, however, cannot currently handle nested sets, which are very commonly used in TLA\(^+\). It remains a viable alternative to the SMT theory of arrays for encoding flat sets, but its use implies important restrictions on the input language and, consequently, on practical applicability.

6 Conclusions

We propose an encoding of the main TLA\(^+\) constructs into the SMT theory of arrays, with the goal of providing the SMT solver with the structural information it needs to efficiently reach a solution. We implemented our encoding in the Apalache model checker, and our evaluation indicates that the arrays-based encoding provides a significant performance improvement when compared against Apalache’s existing SMT encoding and the explicit-state model checker TLC. Encoding the remaining TLA\(^+\) constructs in a structure-preserving way, be it via SMT arrays or algebraic datatypes, remains an interesting research avenue.