1 Introduction

Many applications in hardware and software verification rely on bit-precise reasoning, which can be modeled using the SMT-LIB 2 theory of fixed-width bit-vectors [3]. While Satisfiability Modulo Theories (SMT) solvers are able to reason about bit-vectors of fixed width, they currently require all widths to be expressed concretely (by a numeral) in their input formulas. For this reason, they cannot be used to prove properties of bit-vector operators that are parametric in the bit-width, such as the associativity of bit-vector concatenation. Proof assistants such as Coq [25], which have direct support for dependent types, are better suited for such tasks.

Bit-vector formulas that are parametric in the bit-width arise in the verification of parametric Boolean functions and circuits (see, e.g., [13]). In our case, we are mainly interested in parametric lemmas that are relevant to internal techniques of SMT solvers for the theory of fixed-width bit-vectors. These include, for example, rewrite rules, refinement schemes, and preprocessing passes. Such techniques are developed a priori for every possible bit-width. Meta-reasoning about the correctness of such solvers then requires bit-width independent reasoning.

In this paper, we focus on parametric lemmas that originate from a quantifier-instantiation technique implemented in the SMT solver cvc5 [2]. This technique is based on invertibility conditions [15]. For a trivial case of an invertibility condition, consider the equation \(x + s = t \), where x, s and t are variables of the same bit-vector sort. In the terminology of Niemetz et al. [15], this equation is “invertible for x.” A general inverse, or “solution,” is given by the term \(t - s \). Since there is always such an inverse, the invertibility condition for \(x + s = t \) is simply the universally true formula \(\top \). The formula stating this fact, referred to here as an invertibility equivalence, is \(\top \Leftrightarrow \exists x.\ {x + s = t}\), which is valid in the theory of fixed-width bit-vectors, for any bit-width. In contrast, the equation \(x \cdot s = t \) is not always invertible for x. A necessary and sufficient condition for invertibility in this case was found in [15] to be \( (- s \mid s) \mathrel { \& } t = t \). So, the invertibility equivalence \( {(- s \mid s) \mathrel { \& } t = t} \Leftrightarrow \exists x.\ {x \cdot s = t}\) is valid for any bit-width. Notice that the invertibility condition does not contain x. Hence, invertibility conditions can be seen as a technique for quantifier elimination.
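To make the multiplication example concrete, the equivalence can be tested exhaustively at a few small bit-widths. The Python sketch below is only a sanity check under that restriction, not a bit-width independent proof; the function names are hypothetical and bit-vectors are modeled as non-negative integers below \(2^n\).

```python
def mul_invertible(s: int, t: int, n: int) -> bool:
    """Right-hand side: does some x of width n satisfy x * s = t (mod 2^n)?"""
    mask = (1 << n) - 1
    return any((x * s) & mask == t for x in range(1 << n))


def mul_condition(s: int, t: int, n: int) -> bool:
    """Left-hand side: the invertibility condition (-s | s) & t = t."""
    mask = (1 << n) - 1
    neg_s = (-s) & mask  # two's-complement negation at width n
    return ((neg_s | s) & t) == t


def check_mul_equivalence(n: int) -> bool:
    """Compare both sides for every pair (s, t) of width n."""
    return all(
        mul_condition(s, t, n) == mul_invertible(s, t, n)
        for s in range(1 << n)
        for t in range(1 << n)
    )


# The equivalence holds at every width tried here.
assert all(check_mul_equivalence(n) for n in range(1, 6))
# x * 2 = 1 has no solution at width 2, and the condition rejects it.
assert not mul_invertible(2, 1, 2) and not mul_condition(2, 1, 2)
```

The enumeration grows as \(2^{3n}\), which is one reason such checks cannot replace a parametric proof.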

Table 1. The signatures \(\varSigma _{1}\) and \(\varSigma _{0}\) with SMT-LIB 2 syntax. \(\varSigma _{1}\) consists of the operators in the entire table. \(\varSigma _{0}\) consists of the operators in the upper part.

In [15], a total of 160 invertibility conditions were provided. However, they were verified only for bit-widths up to 65, due to the reasoning limitations of SMT solvers mentioned earlier. Recent work [16, 17] addresses this challenge by translating the invertibility equivalences to the combined theory of non-linear integer arithmetic and uninterpreted functions. This approach was partially successful, but failed to verify over a quarter of the equivalences.

We verify invertibility equivalences proposed in [15] by proving them interactively in Coq. From a representative subset of the invertibility equivalences, we prove 19 equivalences, 12 of which were not proven in [16, 17]. The remaining 7 were already proved there, and our Coq proofs provide more confidence in them. Our results offer evidence that proof assistants can support automated theorem provers in meta-verification tasks.

To facilitate the verification of invertibility equivalences, we use a rich Coq library for bit-vectors, which is a part of the SMTCoq project [10]. This Coq library models the theory of fixed-width bit-vectors adopted by the SMT-LIB 2 standard [3]. For this work, we extended the library with the arithmetic right-shift operation and the unsigned weak less-than and greater-than predicates.

To summarize, the contributions of this paper are as follows: (i) a description of the SMTCoq bit-vector library; (ii) extensions to the signature and proofs of the library; and (iii) formal proofs in Coq of invertibility equivalences. These contributions, while important in their own right, have the potential to go beyond the verification of invertibility equivalences. For (i) and (ii), we envision that the library, as well as its extension, will be useful for the formalization of other bit-precise reasoning mechanisms, especially related to SMT, such as rewriting rules, lemma schemas, interactive verification, and more. For (iii), invertibility conditions are primarily used for quantifier instantiation (see, e.g., [15]). We hope that the increased confidence in their correctness will encourage their usage in other contexts and in more solvers. Further, the formal proofs can serve as guiding examples for other proofs related to bit-precise reasoning.

The remainder of this paper is organized as follows. After technical preliminaries in Sect. 2, we formalize invertibility conditions in Sect. 3 and discuss previous attempts at verifying them. In Sect. 4, we describe the Coq library and our extensions to it. In Sect. 5, we discuss our Coq proofs. We conclude in Sect. 6 with directions for future work. A preliminary version of this work was presented as an extended abstract in the proceedings of the PxTP 2019 workshop [11]. The current version is more detailed and complete. In particular, the one Coq proof that was missing in [11] is now completed.

2 Preliminaries

2.1 Theory of Bit-Vectors

We assume the usual terminology of many-sorted first-order logic with equality (see, e.g., [12]). We denote equality by \(=\), and use \(x\ne y\) as an abbreviation for \(\lnot (x=y)\). The signature \(\varSigma _{BV} \) of the SMT-LIB 2 theory of fixed-width bit-vectors defines a unique sort for each positive integer n, which we denote by \(\sigma _{[n]} \). For every positive integer n and bit-vector of width n, the signature contains a constant symbol of sort \(\sigma _{[n]} \), representing that bit-vector, which we denote as a binary string of length n. The function and predicate symbols of \(\varSigma _{BV} \) are as described in the SMT-LIB 2 standard. Formulas of \(\varSigma _{BV} \) are built from variables, bit-vector constants, and the function and predicate symbols of \(\varSigma _{BV} \), along with the usual logical connectives and quantifiers. We write \(\psi [x_{1},\ldots ,x_{n}]\) to represent a formula whose free variables are from the set \(\{x_{1},\ldots ,x_{n}\}\).

The semantics of \(\varSigma _{BV} \)-formulas is given by interpretations where the domain of \(\sigma _{[n]} \) is the set of bit-vectors of width n, and the function and predicate symbols are interpreted as specified by the SMT-LIB 2 standard. A \(\varSigma _{BV} \)-formula is valid in the theory of fixed-width bit-vectors if it is satisfied by every such interpretation.

Table 1 contains the operators from \(\varSigma _{BV} \) for which invertibility conditions were defined in [15]. We define \(\varSigma _{1}\) to be the signature that contains only these symbols. \(\varSigma _{0}\) is the sub-signature obtained by only taking the operators from the upper part of the table. We use the (overloaded) constant 0 to represent the bit-vectors composed of all 0-bits.

2.2 Coq

The Coq proof assistant is based on the calculus of inductive constructions (CIC) [20]. It implements properties as types and proofs as terms, reducing proof-checking to type-checking. Coq has a rich type system that allows highly expressive propositions to be stated and proved in this manner. One feature of particular interest is dependent types (types that can depend on values), through which one can express correctness properties within types. We refer to non-dependent types as simple types.

The Coq module system — in addition to allowing for principled separations of large developments — allows the abstraction of complex types along with operations over them as modules. A module signature or module type acts as an interface to a module, specifying the type it encapsulates along with the signatures of the associated operators. A functor is a module-to-module function.

3 Invertibility Conditions and Their Verification

In [15], a technique to solve quantified bit-vector formulas is presented, which is based on invertibility conditions.

Definition 1

An invertibility condition for a variable x in a \(\varSigma _{BV} \)-literal \(\ell [x,s,t]\) is a formula \(IC[s,t]\) such that \(\forall s.\forall t.\ IC[s,t] \Leftrightarrow \exists x.\ \ell [x,s,t]\) is valid in the theory of fixed-width bit-vectors.

Example 1

The invertibility condition for x in \( x \mathrel { \& } s = t\) is \( t \mathrel { \& } s = {t}\).   \(\square \)

In [15], invertibility conditions are defined for a representative set of literals \(\ell \) over the bit-vector operators of \(\varSigma _{1}\), having a single occurrence of x. The soundness of the technique proposed in that work relies on the correctness of the invertibility conditions. Every literal \(\ell [x,s,t]\) and its corresponding invertibility condition \(IC[s,t]\) induce an invertibility equivalence.

Definition 2

The invertibility equivalence associated with the literal \(\ell [x,s,t]\) and its invertibility condition \(IC[s,t]\) is the formula

$$\begin{aligned} IC[s,t]\Leftrightarrow \exists x.\ \ell [x,s,t] \end{aligned}$$
(1)

The correctness of invertibility equivalences should be verified for all possible sorts for the variables x, s, t for which the condition is well sorted. Concretely, one needs to prove the validity of the following formula:

$$\begin{aligned} \forall n:\mathbb {N}.\ n>0\Rightarrow \,\forall s:\sigma _{[n]}.\forall t:\sigma _{[n]}.\ IC[s,t] \Leftrightarrow \exists x:\sigma _{[n]}.\ \ell [x,s,t] \end{aligned}$$
(2)

This was done in [15], but only for concrete values of n from 1 to 65, using solvers for the theory of fixed-width bit-vectors. In contrast, Eq. (2) cannot even be expressed in this theory. To overcome this limitation, later work suggested a translation from bit-vector formulas over parametric bit-widths to the theory of non-linear integer arithmetic with uninterpreted functions [16, 17]. Thanks to this translation, the authors were able to verify the correctness of 110 out of 160 invertibility equivalences. For the remaining 50 equivalences, it seems appropriate to use a proof assistant, as this allows the user to intervene and provide crucial intermediate steps. Even for the 110 invertibility equivalences that were proved, a proof in a proof assistant offers a higher level of confidence than automatic verification by an SMT solver, because the trusted code base of a proof assistant is smaller than that of an automatic theorem prover such as an SMT solver.

Fig. 1. The level of confidence achieved by the different approaches.

Figure 1 depicts the level of confidence achieved by the various approaches to verify invertibility equivalences. The smallest circle, labeled auto-65, represents the approach taken by [15], where invertibility equivalences were verified automatically up to 65 bits. While a step in the right direction, this approach is insufficient, because invertibility conditions are used for arbitrary bit-widths. The next circle, labeled auto-ind, depicts the approach of [17], which addresses the restrictions of auto-65 by providing bit-width independent proofs of the invertibility equivalences. However, both auto-65 and auto-ind provide proofs by SMT solvers, which are less trusted than interactive theorem provers (ITPs). The largest circle (Coq) corresponds to the work presented in the current paper which, while addressing the limitations of auto-65 via bit-width independent proofs, also provides stronger verification guarantees by proving the equivalences in an interactive theorem prover. Moreover, with this approach, we were able to prove equivalences that could not be fully verified (for arbitrary bit-widths) by either auto-65 or auto-ind.

4 The BVList Library

In this section, we describe the Coq library we use and the extensions we developed with the goal of formalizing and proving invertibility equivalences. Various formalizations of bit-vectors in Coq exist. The internal Coq library of bit-vectors [9] is one, but it has only definitions and no lemmas. The Bedrock Bit Vectors Library [6] treats bit-vectors as words (machine integers). The SSRBit Library [5] represents bit-vectors as finite bit-sets in Coq and extracts them to OCaml machine integers. Our library is more suited to the SMT-LIB 2 bit-vectors, and includes operators that are not fully covered by any of the previously mentioned libraries. More recently, Shi et al. [22] developed a library called CoqQFBV that represents a bit-vector as a sequence of Booleans, defines operators over it, and proves the correctness of these operations with respect to a (machine integer) semantics. They use this library to define a bit-blasting algorithm in Coq, which is extracted into an OCaml program to perform certified bit-blasting. Since CoqQFBV covers the entire SMT-LIB 2 bit-vector signature, it would be a good alternative to ours in formalizing and proving invertibility conditions. Our library offers a rich set of lemmas over bit-vector operations that makes it suitable for proofs of invertibility conditions and other bit-vector properties. Bit-vectors have also been formalized in other proof assistants. Within the Isabelle/HOL framework, one can utilize the library developed by Beeren et al. [4] to align with SMT-LIB 2 bit-vector operations. Furthermore, Harrison [1] presents a formalization of finite-dimensional Euclidean space within HOL Light, accompanied by an implementation of vectors.

4.1 BVList Without Extensions

BVList was developed for SMTCoq  [10], a Coq plugin that enables Coq to dispatch proofs to external proof-producing solvers. While the library was only briefly mentioned in [10], here we provide more details.

The library adopts the little-endian notation for bit-vectors, following the internal representation of bit-vectors in SMT solvers such as cvc5, and corresponding to lists in Coq. This makes arithmetic operations easier to perform since the least significant bit of a bit-vector is the head of the Boolean list that represents it.
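The convenience of the little-endian choice can be illustrated with a small Python model, where a bit-vector is a list of Booleans whose head is the least significant bit. This is an illustrative sketch only: the helper names `to_bits`, `from_bits` and `add_bits` are hypothetical and do not correspond to BVList definitions.

```python
from typing import List


def to_bits(value: int, n: int) -> List[bool]:
    """Little-endian Boolean list of width n: index 0 is the least significant bit."""
    return [bool((value >> i) & 1) for i in range(n)]


def from_bits(bits: List[bool]) -> int:
    """Inverse of to_bits: read the list back as an unsigned integer."""
    return sum(1 << i for i, b in enumerate(bits) if b)


def add_bits(a: List[bool], b: List[bool]) -> List[bool]:
    """Ripple-carry addition that walks both lists from the head (the LSB)."""
    out, carry = [], False
    for x, y in zip(a, b):
        out.append(x ^ y ^ carry)
        carry = (x and y) or (carry and (x ^ y))
    return out  # the final carry is dropped: arithmetic is modulo 2^n


assert to_bits(6, 4) == [False, True, True, False]
assert from_bits(add_bits(to_bits(11, 4), to_bits(7, 4))) == (11 + 7) % 16
```

Because the carry propagates from the head of the list onward, the addition is a single left-to-right pass with no reversal.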

Another choice is how to formalize the bit-vector type. A dependently-typed definition is natural, since then the type of a bit-vector is parameterized by its length. However, such a representation leads to some difficulties in proofs. Dependent pattern-matching or case-analysis with dependent types is cumbersome and unduly complex (see, e.g., [23]), because of the complications brought by unification in Coq (which is inherently undecidable [24]). A simply-typed definition, on the other hand, does not pose such obstacles for proofs, but is less natural, as the length becomes external to the type. For convenience, the BVList library defines both the dependently-typed and the simply-typed version of bit-vectors. It uses the Coq module system to separate them, and a functor that connects them, avoiding redundancy. The relationship between the two definitions is depicted in Fig. 2.

In BVList, a dependently-typed bit-vector is a record parameterized by its size n and consisting of two fields: a Boolean list and a condition to ensure that the list has length n. This type, and the corresponding lemmas and properties over it, are encapsulated by the BITVECTOR_LIST module of type BITVECTOR. A simply-typed or raw bit-vector representation is simply a Boolean list which, along with its associated operators and lemmas is specified by module signature RAWBITVECTOR and implemented in module RAWBITVECTOR_LIST. In other words, the interface of BVList offers dependently-typed bit-vectors, while the underlying operators are defined and proofs are performed using raw bit-vectors.

Fig. 2. Modular separation of BVList

A functor called RAW2BITVECTOR derives corresponding definitions and proofs over dependently-typed bit-vectors within the module for dependent types, when it is applied to RAWBITVECTOR_LIST. The functor establishes a correspondence between the two theories so that one can first prove a bit-vector property in the context of the simply-typed theory and then map it to its corresponding dependently-typed one via the functor module. Put differently, users of the library can encode theorem statements more naturally, in a more expressive environment employing dependent types. For proofs, one can map them (via the functor) to the equivalent encodings with simple types, and prove them there.
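The division of labor between the two representations can be mimicked loosely in Python. The sketch below is an analogy under stated assumptions, not the library's code: `BitVector` stands in for the record with a length condition, `Raw` for a plain Boolean list, and `lift2` is a hypothetical helper playing the role that the functor plays in Coq. Unlike in Coq, the length condition here is checked at run time rather than by the type checker.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple

Raw = List[bool]  # "raw" bit-vector: the width is not part of the type


@dataclass(frozen=True)
class BitVector:
    """Stand-in for the dependently-typed record: a list plus a width invariant."""
    n: int
    bits: Tuple[bool, ...]

    def __post_init__(self):
        if len(self.bits) != self.n:
            raise ValueError("list length must equal the declared width")


def raw_and(a: Raw, b: Raw) -> Raw:
    """Bitwise conjunction on raw lists (defined and reasoned about at this level)."""
    return [x and y for x, y in zip(a, b)]


def lift2(op: Callable[[Raw, Raw], Raw]) -> Callable[[BitVector, BitVector], BitVector]:
    """Hypothetical analogue of the functor: wrap a raw operator for BitVector."""
    def lifted(a: BitVector, b: BitVector) -> BitVector:
        if a.n != b.n:
            raise ValueError("width mismatch")
        return BitVector(a.n, tuple(op(list(a.bits), list(b.bits))))
    return lifted


bv_and = lift2(raw_and)  # the width-indexed operator is derived, not rewritten
```

For instance, `bv_and(BitVector(2, (True, False)), BitVector(2, (True, True)))` yields `BitVector(2, (True, False))`, while constructing `BitVector(3, (True,))` raises an error.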

4.2 Extending BVList

Out of the 13 bit-vector functions and 10 predicates contained in \(\varSigma _{1}\), BVList had direct support for 10 functions and 6 predicates. The predicate symbols that were not directly supported were the weak inequalities \(\le _u\), \(\ge _u\), \(\le _s\), \(\ge _s\) and the unsupported function symbols were \(\mathop {>\!>_a}\), \(\div \), and \(\bmod \). We extended BVList with the operator \(\mathop {>\!>_a}\) and the predicates \(\le _u\) and \(\ge _u\) in order to support the corresponding invertibility conditions. Additionally, we redefined \(\mathop {<\!<}\) and \(\mathop {>>}\) in order to simplify the proofs of invertibility conditions over them.

We focused on invertibility conditions for literals of the form \(x\diamond s \bowtie t\) and \(s\diamond x\bowtie t\), where \(\diamond \) and \(\bowtie \) are respectively function and predicate symbols in \(\varSigma _{0}\). \(\varSigma _{0}\) was chosen as a representative set because it is both expressive enough (in the sense that other operators can be easily translated to this fragment), and feasible for proofs in Coq using the library. In particular, it was chosen as one that would require the minimal amount of changes to BVList. As a result, such literals, as well as their invertibility conditions, contain only operators supported by BVList (after its extension with \(\mathop {>\!>_a} \), \(\le _u \), and \(\ge _u \)). Supporting the full set of operators in \(\varSigma _{1}\), both in the library and the proofs is left for future work.

Fig. 3. Definitions of \(\le _u \) in Coq.

In what follows, we describe our extensions to BVList with weak unsigned inequalities, alternative definitions for logical shifts, and the arithmetic right shift operator.

Weak Unsigned Inequalities. We added both weak inequalities for unsigned bit-vectors, \(\le _u \) and \(\ge _u \). We illustrate this extension via that of the \(\le _u\) operator (the extension of \(\ge _u\) is similar). The relevant Coq definitions are provided in Fig. 3. The top three definitions (including the fixpoint) cover the simply-typed representation, and the fourth, bv_ule, is the dependently-typed representation that invokes the definition with the same name from module M of type RAWBITVECTOR. Like most other operators, \(\le _u\) (over raw bit-vectors) is defined over a few layers. The function bv_ule, at the highest layer, ensures that comparisons are between bit-vectors of the same size and then calls ule_list. Since we want to compare bit-vectors starting from their most significant bits and the input lists start instead with the least significant bits, ule_list first reverses the two lists. Then it calls ule_list_big_endian, which we consider to be at the lowest layer of the definition. This function does a lexicographic comparison of the two lists, starting from the most significant bits.
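The layering just described can be modeled in Python as follows. The function names mirror those mentioned in the text, but the bodies are a sketch written from the prose description rather than a transcription of the Coq definitions; in particular, returning False on a size mismatch is an assumption, and `to_bits` is a hypothetical helper used only for testing.

```python
from typing import List


def ule_list_big_endian(x: List[bool], y: List[bool]) -> bool:
    """Lexicographic x <= y on equal-length lists whose head is the MSB."""
    for xi, yi in zip(x, y):
        if xi != yi:
            return yi  # the first differing bit decides: x < y iff y has the 1
    return True  # all bits equal, so x <= y holds


def ule_list(x: List[bool], y: List[bool]) -> bool:
    """Middle layer: reverse the little-endian inputs, then compare from the MSB."""
    return ule_list_big_endian(x[::-1], y[::-1])


def bv_ule(x: List[bool], y: List[bool]) -> bool:
    """Top layer: only compare bit-vectors of the same size (assumed False otherwise)."""
    return len(x) == len(y) and ule_list(x, y)


def to_bits(value: int, n: int) -> List[bool]:
    """Hypothetical helper: little-endian list of width n."""
    return [bool((value >> i) & 1) for i in range(n)]


# The model agrees with integer <= on all pairs of 3-bit values.
assert all(
    bv_ule(to_bits(a, 3), to_bits(b, 3)) == (a <= b)
    for a in range(8)
    for b in range(8)
)
```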

To see why the addition of \(\le _u\) to the library is useful, consider, for example, the following parametric lemma, stating that \({\sim }\, \!0 \) is the largest unsigned bit-vector of its type:

$$\begin{aligned} \forall x:\sigma _{[n]}.\ x \le _u {\sim }\, \!0 \end{aligned}$$
(3)

Without an operator for the weak inequality, we would write it as:

$$\begin{aligned} \forall x:\sigma _{[n]}.\ x <_u {\sim }\, \!0 \vee {x = {\sim }\, \!0 } \end{aligned}$$
(4)

Fig. 4. Various definitions of \(\mathop {<\!<}\).

Since the definitions of \(<_u \) and \(=\) have a layered structure similar to that of \(\le _u \), a proof of Eq. (4) has to unfold the layers of \(<_u\) and \(=\) separately, whereas with \(\le _u \), as in Eq. (3), this unfolding is done only once.

Left and Right Logical Shifts. We have redefined the shift operators \(\mathop {<\!<} \) and \(\mathop {>>} \) in BVList. Figure 4 shows both the original and new definitions of \(\mathop {<\!<}\). Those of \(\mathop {>>}\) are similar. Originally, \(\mathop {<\!<} \) was defined using shl_one_bit and shl_n_bits. The function shl_one_bit shifts the bit-vector to the left by one bit and is called by shl_n_bits as many times as necessary. The new definition shl_n_bits_a uses mk_list_false, which constructs the necessary list of 0 bits, and appends (++ in Coq) to it the bits to be shifted from the original bit-vector, which are retrieved using the firstn function from the Coq standard library for lists. The nat type used in Fig. 4 is the Coq representation of Peano natural numbers, which has \(\texttt {0}\) and \(\texttt {S}\) as its two constructors, as depicted in the cases rendered by pattern matching on n (lines 9-10). The theorem at the bottom of Fig. 4 asserts the equivalence of the two representations, allowing us to switch between them when needed. In the extended library, bv_shl defines the left shift operation using shl_n_bits whereas bv_shl_a does it using shl_n_bits_a. This new representation was useful in proving some of the invertibility equivalences over shift operators (see, e.g., Example 4 below).
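The relationship between the two styles of definition can be illustrated by a Python model over little-endian Boolean lists. The names follow the text, but the bodies are reconstructed from the prose description and may differ in detail from the Coq definitions in Fig. 4; list slicing `a[:k]` plays the role of firstn, and `shifts_agree` is a hypothetical helper that tests the equivalence exhaustively on short lists.

```python
from itertools import product
from typing import List


def mk_list_false(n: int) -> List[bool]:
    """A list of n zero bits."""
    return [False] * n


def shl_one_bit(a: List[bool]) -> List[bool]:
    """Shift left by one: a 0 enters at the head (LSB), the last bit (MSB) is dropped."""
    return [] if not a else [False] + a[:-1]


def shl_n_bits(a: List[bool], n: int) -> List[bool]:
    """Original style: iterate the one-bit shift n times."""
    for _ in range(n):
        a = shl_one_bit(a)
    return a


def shl_n_bits_a(a: List[bool], n: int) -> List[bool]:
    """Alternative style: n zeros followed by the surviving low-order bits."""
    if n < len(a):
        return mk_list_false(n) + a[: len(a) - n]
    return mk_list_false(len(a))


def shifts_agree(max_width: int) -> bool:
    """Check that both definitions coincide on all lists up to max_width."""
    return all(
        shl_n_bits(list(a), k) == shl_n_bits_a(list(a), k)
        for w in range(max_width + 1)
        for a in product([False, True], repeat=w)
        for k in range(max_width + 3)
    )


assert shifts_agree(4)
```

The alternative form exposes list append and prefix operations directly, which is what makes existing list lemmas applicable.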

Arithmetic Right Shift. Unlike logical shifts, which were already defined in BVList and for which we have added alternative definitions, arithmetic right shift was not defined at all. We provided two alternative definitions for it, very similar to the definitions of logical shifts: bv_ashr and bv_ashr_a. Both definitions are conditional on the sign of the bit-vector (its most-significant bit). Apart from this detail, the definitions take the same approach taken by shl_n_bits and shl_n_bits_a from Fig. 4. Operator bv_ashr uses the definition of an independent shift and repeats it as many times as necessary, and bv_ashr_a uses either mk_list_false or mk_list_true to append the necessary number of sign bits to the shifted bits.
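A Python model of the two styles, again over little-endian Boolean lists, may help fix intuitions. It is a sketch based on the description above rather than the library's code: the names `ashr_one_bit`, `ashr_n_bits`, `ashr_n_bits_a`, `reference` and `ashr_agree` are hypothetical, and the reference semantics relies on Python's `>>` being an arithmetic shift on negative integers.

```python
from typing import List


def to_bits(value: int, n: int) -> List[bool]:
    return [bool((value >> i) & 1) for i in range(n)]


def from_bits(bits: List[bool]) -> int:
    return sum(1 << i for i, b in enumerate(bits) if b)


def ashr_one_bit(a: List[bool], sign: bool) -> List[bool]:
    """Drop the head (LSB) and append one copy of the sign bit at the MSB end."""
    return a[1:] + [sign] if a else []


def ashr_n_bits(a: List[bool], n: int) -> List[bool]:
    """Iterated style: repeat the one-bit shift n times."""
    sign = bool(a) and a[-1]
    for _ in range(n):
        a = ashr_one_bit(a, sign)
    return a


def ashr_n_bits_a(a: List[bool], n: int) -> List[bool]:
    """Append style: the surviving bits followed by n copies of the sign bit."""
    sign = bool(a) and a[-1]
    if n < len(a):
        return a[n:] + [sign] * n
    return [sign] * len(a)


def reference(v: int, k: int, n: int) -> int:
    """Integer semantics: interpret v as signed, shift, re-encode at width n."""
    signed = v - (1 << n) if v >> (n - 1) else v
    return (signed >> k) & ((1 << n) - 1)


def ashr_agree(max_width: int) -> bool:
    for n in range(1, max_width + 1):
        for v in range(1 << n):
            for k in range(n + 3):
                a = to_bits(v, n)
                if ashr_n_bits(a, k) != ashr_n_bits_a(a, k):
                    return False
                if from_bits(ashr_n_bits_a(a, k)) != reference(v, k, n):
                    return False
    return True


assert ashr_agree(4)
```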

5 Proving Invertibility Equivalences in Coq

In this section we provide specific details about proving invertibility equivalences in Coq. We start by outlining the general approach for proving invertibility equivalences in Sect. 5.1. Then, Sect. 5.2 presents detailed examples of such proofs. Section 5.3 summarizes the results and impact of these proofs.

5.1 General Approach

The natural representation of bit-vectors in Coq is the dependently-typed representation, and therefore the invertibility equivalences are formulated using this representation. In keeping with the modular approach described in Sect. 4, however, proofs in this representation are composed of proofs over simply-typed bit-vectors, which are easier to reason about. Most of the work is on proving an equivalence over raw bit-vectors. Then, we derive the proof of the corresponding equivalence over dependently-typed bit-vectors using a smaller, boilerplate set of tactics. Since this derivation process is mostly the same across many equivalences, these tactics are a good candidate for automation in the future.

When proving an invertibility equivalence \(IC[s,t]\Leftrightarrow \exists x.\ \ell [x,s,t]\), we first split it into two sub-goals: the left-to-right and right-to-left implications. For proving the left-to-right implication, since Coq implements a constructive logic, the only way to prove an existentially quantified formula is to construct a term witnessing it. Thus, in addition to establishing the equivalence, our proofs yield, as a positive side effect, actual inverses for x in literals of the form \(\ell [x,s,t]\). In Niemetz et al. [16], these are called conditional inverses, as the fact that they are inverses is conditional on the correctness of the invertibility condition. There, such inverses were synthesized automatically for a subset of the literals. In each of our Coq proofs, such an inverse is found, even when the proof is done by case-splitting. This provides a more general solution than the one in [16], which did not consider case-splitting.

Example 2

Consider the literal \(s \mathop {>\!>_a} x \ge _u t \). Its invertibility condition is \((s \ge _u {\sim }\, \!s ) \vee (s \ge _u t)\). The left-to-right implication of the invertibility equivalence is:

$$\begin{aligned} \forall s, t : \sigma _{[n]}.\ (s \ge _u {\sim }\, \!s ) \vee (s \ge _u t)\Rightarrow \exists x : \sigma _{[n]}.\ s \mathop {>\!>_a} x \ge _u t \end{aligned}$$

Here, case splitting is done on the disjunction in the invertibility condition. When \(s \ge _u {\sim }\, \!s \) is true, the inverse for x is the bit-vector constant that corresponds to the length of s, namely n; when \(s \ge _u t \) is true, the inverse is 0.    \(\square \)
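The case split in Example 2 can be replayed at small widths with the following Python sketch, which models bit-vectors as unsigned integers. It is an exhaustive test for a few values of n, not a proof; `ashr`, `witness` and `example2_holds` are hypothetical names introduced for illustration.

```python
from typing import Optional


def ashr(s: int, x: int, n: int) -> int:
    """Arithmetic right shift of the width-n value s by x positions."""
    mask = (1 << n) - 1
    signed = s - (1 << n) if s >> (n - 1) else s
    return (signed >> min(x, n)) & mask  # shifting by n or more saturates


def witness(s: int, t: int, n: int) -> Optional[int]:
    """Choose the inverse by case-splitting on the disjuncts of the condition."""
    mask = (1 << n) - 1
    if s >= (~s & mask):  # s >=_u ~s, which holds exactly when the sign bit of s is 1
        return n & mask   # the constant n as a width-n bit-vector
    if s >= t:            # s >=_u t
        return 0
    return None           # the invertibility condition is false


def example2_holds(max_width: int) -> bool:
    for n in range(1, max_width + 1):
        for s in range(1 << n):
            for t in range(1 << n):
                w = witness(s, t, n)
                exists = any(ashr(s, x, n) >= t for x in range(1 << n))
                if w is None:
                    if exists:  # condition false but an inverse exists
                        return False
                elif ashr(s, w, n) < t:  # chosen witness does not work
                    return False
    return True


assert example2_holds(4)
```

In the first case the shift by n fills the result with copies of the sign bit, giving the all-ones vector, which is at least any t; in the second case shifting by 0 leaves s unchanged.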

In addition to BVList, several proofs of invertibility equivalences benefited from CoqHammer [7], a plug-in that aims at extending the level of automation in Coq by combining machine learning and automated reasoning techniques in a similar fashion to what is done by Sledgehammer [21] in Isabelle/HOL [18]. CoqHammer, when triggered on some Coq goal, (i) submits the goal together with potentially useful terms to external solvers/automated-provers, (ii) attempts to reconstruct returned proofs (if any) directly in the Coq tactic language Ltac [8], and (iii) outputs the set of tactics closing the goal in case of success. As we directly employ these tactics inside BVList, one does not need to install CoqHammer in order to build the library, although it would be beneficial for further extensions.

5.2 Detailed Examples

In this section we provide specific examples for proofs of invertibility equivalences. The first example illustrates the two-theories approach of the library.

Example 3

Consider the literal \(s \mathop {>\!>_a} x <_u t \). Its invertibility condition is \(((s<_u t \,\vee \,\lnot ( s <_s 0))\,\wedge \,t \ne 0)\). Figure 5 shows the proof of the following direction of the corresponding invertibility equivalence:

$$\begin{aligned} \forall s, t : \sigma _{[n]}.\ (\exists x:\sigma _{[n]}.\ s \mathop {>\!>_a} x<_u t)\Rightarrow ((s<_u t \,\vee \,\lnot ( s <_s 0))\,\wedge \,t \ne 0) \end{aligned}$$

In the proof, lines 8–11 transform the dependent bit-vectors from the goal and the hypotheses into simply-typed bit-vectors. Then, lines 12–14 invoke the corresponding lemma for simply-typed bit-vectors (called InvCond.bvashr_ult2_rtl) along with some simplifications.   \(\square \)

Most of the effort in this project went into proving equivalences over raw bit-vectors, as the following example illustrates.

Example 4

Consider the literal \(x \mathop {<\!<} s >_u t \). Its invertibility condition is \((t<_u {\sim }\, \!0 \mathop {<\!<} s )\). The corresponding invertibility equivalence is:

$$\begin{aligned} \forall s, t : \sigma _{[n]}.\ (t<_u {\sim }\, \!0 \mathop {<\!<} s ) \Leftrightarrow (\exists x:\sigma _{[n]}.\ x \mathop {<\!<} s >_u t) \end{aligned}$$
(5)

The left-to-right implication is easy to prove using \({\sim }\, \!0 \) itself as the witness of the existential proof goal and considering the symmetry between \(>_u \) and \(<_u \). The proof of the right-to-left implication relies on the following lemma:

$$\begin{aligned} \forall x, s : \sigma _{[n]}.\ (x \mathop {<\!<} s) \le _u ({\sim }\, \!0 \mathop {<\!<} s) \end{aligned}$$
(6)

From the right side of the equivalence in Eq. (5), we get some skolem x for which \(x \mathop {<\!<} s >_u t \) holds. Flipping the inequality, we have that \(t<_u x \mathop {<\!<} s \); using this, and transitivity over \(<_u\) and \(\le _u\), the lemma given by  Eq. (6) gives us the left side of the equivalence in Eq. (5).

As mentioned in Sect. 4, we have redefined the shift operators \(\mathop {<\!<} \) and \(\mathop {>>} \) in the library. This was instrumental, for example, in the proof of Eq. (6).

Fig. 5. A proof of one direction of the invertibility equivalence for \(\mathop {>\!>_a}\) and \(<_u\) using dependent types.

The new definition uses firstn and ++, over which many useful properties are already proven in the standard library. This benefits us in manual proofs, and in calls to CoqHammer, since the latter is able to use lemmas from the imported libraries to prove the goals that are given to it. Using this representation, proving Eq. (6) reduces to proving Lemmas bv_ule_1_firstn and bv_ule_pre_append, shown in Fig. 6. The proof of bv_ule_pre_append benefited from the property app_comm_cons from the standard list library of Coq, whereas firstn_length_le was useful in reducing the goal of bv_ule_1_firstn to the Coq equivalent of Eq. (3). The statements of the properties mentioned from the standard library are also shown in Fig. 6.   \(\square \)
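Both the lemma in Eq. (6) and the equivalence in Eq. (5) can be tested exhaustively at small widths. The Python sketch below does so with bit-vectors modeled as unsigned integers; it is a sanity check only, and the names `shl`, `lemma6_holds` and `equivalence5_holds` are hypothetical.

```python
def shl(x: int, s: int, n: int) -> int:
    """Logical left shift at width n; shifting by n or more yields 0."""
    return (x << min(s, n)) & ((1 << n) - 1)


def lemma6_holds(n: int) -> bool:
    """Eq. (6): (x << s) <=_u (~0 << s) for all x, s of width n."""
    ones = (1 << n) - 1
    return all(
        shl(x, s, n) <= shl(ones, s, n)
        for x in range(1 << n)
        for s in range(1 << n)
    )


def equivalence5_holds(n: int) -> bool:
    """Eq. (5): t <_u (~0 << s) iff some x satisfies (x << s) >_u t."""
    ones = (1 << n) - 1
    return all(
        (t < shl(ones, s, n)) == any(shl(x, s, n) > t for x in range(1 << n))
        for s in range(1 << n)
        for t in range(1 << n)
    )


assert all(lemma6_holds(n) and equivalence5_holds(n) for n in range(1, 5))
```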

Finally, we examine what was considered a challenge problem in the previous version of this work [11]. The next example details how we completed the proof.

Example 5

Consider the literal \((x \mathop {>>} s) >_u t \). Its invertibility condition is \(t <_u ({\sim }\, \!s \mathop {>>} s) \). Now consider the following direction of the corresponding invertibility equivalence:

$$\begin{aligned} \forall s, t:\sigma _{[n]}.\ t <_u ({\sim }\, \!s \mathop {>>} s) \Rightarrow \exists x:\sigma _{[n]}.\ (x \mathop {>>} s) >_u t \end{aligned}$$
(7)

Figure 7 contains the theorem stating the equivalence, and some lemmas used within its proof. A crucial step in the proof of the implication is to rewrite the definition of the right shift operator bv_shr to its alternate definition bv_shr_a (see Sect. 4.2). Unfolding the alternative definition leads to a case-analysis on the following condition:

$$\texttt {toNat}(s) < \texttt {len}(x) $$

where toNat casts a bit-vector to its natural number representation, and len returns the length of a bit-vector as a natural number.

Fig. 6. Examples of lemmas used in proofs of invertibility equivalences.

The challenge in the proof arises in the positive case of the condition, which reduces to a proof of first_bits_zero (see Fig. 7). first_bits_zero says that given \(\texttt {toNat}(s) < \texttt {len}(s) \), the most-significant \(\texttt {len}(s) - \texttt {toNat}(s) \) bits of s are 0. As seen in Fig. 4, the second argument to the top-most layer of the shift (called from bv_shl_eq) is a bit-vector that specifies the number of times to shift the bit-vector in the first argument. This second argument is converted to a natural number by the abstract toNat function invoked above, the concrete definitions of which are specified in Fig. 7 as list2nat_be_a and list2N. At the same level of abstraction, we use rev for the list reversal function corresponding to the Coq function of the same name, and firstn also for its Coq namesake (firstn n l returns the n most significant bits of l), so that first_bits_zero can be specified as follows:

$$\begin{aligned} \texttt {toNat}(s) < \texttt {len}(s) \Rightarrow \texttt {firstn}\ (\texttt {len}(s) - \texttt {toNat}(s))\ (\texttt {rev}(s)) = 0 \end{aligned}$$

The intuition behind its validity is that if the most-significant \(\texttt {len}(s) - \texttt {toNat}(s) \) bits were not 0 then they would contribute to the value of \(\texttt {toNat}(s) \), making it greater than or equal to \(\texttt {len}(s) \) and thus falsifying the condition. However, it is challenging to convert this intuition into a proof using induction over lists, as explained in what follows.
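The statement of first_bits_zero can be tested on all short Boolean lists. The following Python sketch checks the property as specified above, with `to_nat` standing for toNat on little-endian lists; the function names are hypothetical and the check does not reflect how the Coq proof proceeds.

```python
from itertools import product
from typing import List


def to_nat(bits: List[bool]) -> int:
    """Value of a little-endian Boolean list as a natural number."""
    return sum(1 << i for i, b in enumerate(bits) if b)


def first_bits_zero(s: List[bool]) -> bool:
    """If toNat(s) < len(s), the first len(s) - toNat(s) bits of rev(s) are 0."""
    n, v = len(s), to_nat(s)
    if not v < n:
        return True  # premise is false, so the implication holds trivially
    msb_first = s[::-1]  # rev(s)
    return not any(msb_first[: n - v])  # firstn (len(s) - toNat(s)) (rev s)


def holds_up_to(max_width: int) -> bool:
    return all(
        first_bits_zero(list(s))
        for w in range(max_width + 1)
        for s in product([False, True], repeat=w)
    )


assert holds_up_to(8)
```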

Fig. 7. Invertibility equivalence for \(\mathop {>>} \) and \(>_u \) and some lemmas used by its proof.

To prove first_bits_zero, we redefined list2N as a tail-recursive function list2NTR. This step was proven to be sound by a lemma of equivalence between the two definitions (list2N_eq). Since list2N is not tail recursive, it only begins computation at the end of the input list representing a bit-vector. Such a definition complicates the proof of first_bits_zero when the proof is based on the usual induction principle over the structure of the Boolean list underlying the bit-vector s, because the definition does not easily reduce (via \(\iota \)-reduction for inductive definitions [19]) to a useful expression in the step case of the intended induction.
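The contrast between the two styles of definition can be sketched in Python. The functions below are assumptions for illustration and need not match the shape of list2N and list2NTR in Fig. 7: `list2n` recurses structurally on a little-endian list, while `list2n_tr` threads an accumulator through a list read most-significant bit first.

```python
from itertools import product


def list2n(bits):
    """Structural recursion on a little-endian list.

    The recursive call on the tail must return before anything is computed,
    so evaluation effectively starts at the end of the list.
    """
    if not bits:
        return 0
    head, tail = bits[0], bits[1:]
    return 2 * list2n(tail) + (1 if head else 0)


def list2n_tr(bits_msb_first, acc=0):
    """Accumulator-passing (tail-recursive in shape) variant.

    Hypothetical: it reads the most-significant bit first and updates the
    accumulator before the recursive call.
    """
    if not bits_msb_first:
        return acc
    head, tail = bits_msb_first[0], bits_msb_first[1:]
    return list2n_tr(tail, 2 * acc + (1 if head else 0))


def definitions_agree(max_width):
    """Analogue of an equivalence lemma: list2n s = list2n_tr (rev s)."""
    return all(
        list2n(list(s)) == list2n_tr(list(s)[::-1])
        for w in range(max_width + 1)
        for s in product([False, True], repeat=w)
    )


def append_step_holds(max_width):
    """Appending one bit on the right doubles the value and adds the new bit."""
    return all(
        list2n_tr(list(xs) + [x]) == 2 * list2n_tr(list(xs)) + (1 if x else 0)
        for w in range(max_width + 1)
        for xs in product([False, True], repeat=w)
        for x in (False, True)
    )
```

The second check shows why the accumulator version pairs well with an induction that extends lists on the right: the value of the extended list is a simple function of the value of the shorter one.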

The advantage of tail recursion in this context is best illustrated by Fig. 8, where x is a Boolean variable and xs represents an arbitrary Boolean list. The derivation of the goal from the inductive hypothesis (IH) in derivation (8) of Fig. 8 is complicated in Coq because the functions firstn and rev are not well-matched with list2N, if not incompatible. For instance, observe that in the inductive step (Goal), as the first argument to firstn increases, the number of bits fetched from the list increases towards the right. However, due to the little-endian notation of bit-vectors and the fact that the list cons function (::) can be seen as incrementing its argument list to its left, the rev function must be used to correct the direction of increase of the second argument to firstn. Despite this correction, an induction over s must deal with two structurally different lists.

In contrast, the tail-recursive definition of list2NTR hides the rev function. This is illustrated in derivation (9) in Fig. 8, where toNatTR corresponds to list2NTR. Furthermore, such an induction over lists using append (++) to the right, rather than cons to the left, is possible thanks to the reverse induction principle (see footnote 2). Closing such a goal allowed us to prove the list2NTR-variant of first_bits_zero, specified as first_bits_zeroA in Fig. 7, and the proof of equivalence between the two definitions (list2N_eq) allowed us to use this in closing the original goal (7).    \(\square \)
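For reference, the reverse induction principle on lists used above can be stated as follows, for an arbitrary predicate P over lists. This is the shape of the rev_ind lemma in Coq's standard list library; the notation below is an informal rendering rather than the Coq statement itself:

$$\begin{aligned} P(\texttt {[]}) \wedge \big (\forall x\ l.\ P(l) \Rightarrow P(l \mathbin {\texttt {++}} \texttt {[}x\texttt {]})\big ) \Rightarrow \forall l.\ P(l) \end{aligned}$$

Compared with the usual structural induction, the step case extends the list on the right instead of on the left.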

Fig. 8. Sub-goals generated in the proof of first_bits_zero. Note that 0 is a bit-vector constant of the appropriate length (list of falses).

5.3 Results

Table 2 summarizes the results of proving invertibility equivalences for invertibility conditions in the signature \(\varSigma _{0}\). In the table, \(\checkmark \) means that the invertibility equivalence was successfully verified in Coq but not in Niemetz et al. [17], and means the opposite; means that the invertibility equivalence was verified using both approaches. We successfully proved all invertibility equivalences over \(= \) that are expressible in \(\varSigma _{0}\), including 4 that were not proved in [17]. For the rest of the predicates, we focused only on the 8 invertibility equivalences that were not proved in [17], and succeeded in proving all of them.

Our work thus complements [17] in verifying all invertibility conditions in \(\varSigma _{0}\) for arbitrary bit-widths, by proving all 12 equivalences that were previously unverified, and corroborating 7 others that were verified by SMT solvers. It also complements [15], which verified all invertibility conditions in \(\varSigma _{1}\), but only up to a bit-width of 65.

Table 2. Proved invertibility equivalences in \(\varSigma _{0}\) where \(\bowtie \) ranges over the given predicate symbols. \(\checkmark \) means that the invertibility equivalence was successfully verified in Coq but not in [17], whereas means the opposite; means that the invertibility equivalence was verified using both approaches.

6 Conclusion and Future Work

We have described our work on verifying bit-vector invertibility conditions in the Coq proof assistant, which required extending the BVList library in Coq. In addition to describing the library and our extensions to it, this paper presented details about the Coq proofs of the invertibility equivalences. These proofs were carried out on a representative subset of the operators from the theory of bit-vectors that is well-supported by the extended library. We proved in Coq, for all bit-widths, every equivalence over this subset that previous attempts had left unproven. We also proved in Coq some equivalences that had previously been proven automatically, thus increasing confidence in their correctness.

The most immediate direction for future work is proving more of the invertibility equivalences supported by the bit-vector library. In addition, we plan to extend the library so that it supports the full syntax in which invertibility conditions are expressed, namely \(\varSigma _{1}\). This will also increase the potential usage of the library for other applications. Another direction for future work is to extend the proofs to invertibility conditions where some of the bits are known. Such invertibility conditions were introduced by Niemetz and Preiner [14], but their formal verification for every bit-width is yet to be done.