1 Introduction

Traditionally, automated reasoning techniques for integers focus on polynomial arithmetic. This is not only true in the context of SMT, but also for program verification techniques, since the latter often search for polynomial invariants that imply the desired properties. As invariants are over-approximations, they are well suited for proving “universal” properties like safety, termination, or upper bounds on the worst-case runtime that refer to all possible program runs. However, proving dual properties like unsafety, non-termination, or lower bounds requires under-approximations, so that invariants are of limited use here.

For lower bounds, an infinite set of witnesses is required, as the runtime w.r.t. a finite set of (terminating) program runs is always bounded by a constant. Thus, to prove non-constant lower bounds, symbolic under-approximations are required, i.e., formulas that describe an infinite subset of the reachable states. However, polynomial arithmetic is often insufficient to express such approximations. To see this, consider the program

$$ x \leftarrow 1;\ y \leftarrow \texttt {nondet}(0,\infty );\ \textbf{while}\ y > 0\ \textbf{do}\ x \leftarrow 3 \cdot x;\ y \leftarrow y - 1\ \textbf{done} $$

where \(\texttt {nondet}(0,\infty )\) returns a natural number non-deterministically. Here, the set of reachable states after execution of the loop is characterized by the formula

$$\begin{aligned} \exists n \in \mathbb {N}.\ x = 3^n \wedge y = 0. \end{aligned}$$
(1)
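The characterization (1) can be validated empirically. The following sketch (plain Python; the range of values tried for \(\texttt {nondet}\) is an arbitrary choice) simulates the loop and confirms that every terminating run ends in a state with \(x = 3^n\) and \(y = 0\), where n is the value returned by \(\texttt {nondet}\):

```python
def run(n):
    # x <- 1; y <- nondet(0, inf); here nondet returned n
    x, y = 1, n
    while y > 0:              # while y > 0 do x <- 3*x; y <- y - 1 done
        x, y = 3 * x, y - 1
    return x, y

# every run reaches a final state satisfying formula (1): x = 3^n /\ y = 0
for n in range(0, 12):
    x, y = run(n)
    assert x == 3 ** n and y == 0
```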

In recent work, acceleration techniques have successfully been used to deduce lower runtime bounds automatically [17, 18]. While they can easily derive a formula like (1) from the code above, this is of limited use, as most SMT solvers (Footnote 1) cannot handle terms of the form \(3^n\). Besides lower bounds, acceleration has also successfully been used for proving non-termination [15, 18, 19] and (un)safety [3, 6, 7, 20, 28, 29], where its strength is finding long counterexamples that are challenging for other techniques.

Importantly, exponentiation is not just “yet another function” that can result from applying acceleration techniques. There are well-known, important classes of loops where polynomials and exponentiation always suffice to represent the values of the program variables after executing a loop [16, 26]. Thus, the lack of support for integer exponentiation in SMT solvers is a major obstacle for the further development of acceleration-based verification techniques.

In this work, we first define a novel SMT theory for integer arithmetic with exponentiation. Then we show how to lift standard SMT solvers to this new theory, resulting in our novel tool SwInE (SMT with Integer Exponentiation).

Our technique is inspired by incremental linearization, which has been applied successfully to real arithmetic with transcendental functions, including the natural exponential function \(\textsf {exp}_e(x) = e^x\), where e is Euler’s number [11]. In this setting, incremental linearization considers \(\textsf {exp}_e\) as an uninterpreted function. If the resulting SMT problem is unsatisfiable, then so is the original problem. If it is satisfiable and the model that was found for \(\textsf {exp}_e\) coincides with the semantics of exponentiation, then the original problem is satisfiable. Otherwise, lemmas about \(\textsf {exp}_e\) that rule out the current model are added to the SMT problem, and then its satisfiability is checked again. The name “incremental linearization” is due to the fact that these lemmas only contain linear arithmetic.

The main challenge for adapting this approach to integer exponentiation is to generate suitable lemmas, see Sect. 4.2. Except for so-called monotonicity lemmas, none of the lemmas from [11] easily carry over to our setting. In contrast to [11], we do not restrict ourselves to linear lemmas, but also use non-linear, polynomial lemmas. This is due to the fact that we consider a binary version \(\lambda x, y.\ x^y\) of exponentiation, whereas [11] fixes the base to e. Thus, in our setting, one obtains bilinear lemmas that are linear w.r.t. x as well as y, but may contain multiplication between x and y (i.e., they may contain the subterm \(x \cdot y\)). More precisely, bilinear lemmas arise from bilinear interpolation, which is a crucial ingredient of our approach, as it allows us to eliminate any model that violates the semantics of exponentiation (Theorem 23). Therefore, the name “incremental linearization” does not fit our approach, which is rather an instance of “counterexample-guided abstraction refinement” (CEGAR) [13].

To summarize, our contributions are as follows: We first propose the new SMT theory \(\text {EIA} \) for integer arithmetic with exponentiation (Sect. 3). Then, based on novel techniques for generating suitable lemmas, we develop a CEGAR approach for \(\text {EIA}\) (Sect. 4). We implemented our approach in our novel open-source tool SwInE [22, 23] and evaluated it on a collection of 4627 \(\text {EIA}\) benchmarks that we synthesized from verification problems. Our experiments show that our approach is highly effective in practice (Sect. 6). All proofs can be found in [21].

2 Preliminaries

We are working in the setting of SMT-LIB logic [4], a variant of many-sorted first-order logic with equality. We now introduce a reduced variant of [4], where we only explain those concepts that are relevant for our work.

In SMT-LIB logic, there is a dedicated Boolean sort \({\textbf {Bool}}\), and hence formulas are just terms of sort \({\textbf {Bool}}\). Similarly, there is no distinction between predicates and functions, as predicates are simply functions of type \({\textbf {Bool}}\).

So in SMT-LIB logic, a signature \(\varSigma = (\varSigma ^S,\varSigma ^F,\varSigma ^R)\) consists of a set \(\varSigma ^S\) of sorts, a set \(\varSigma ^F\) of function symbols, and a ranking function \(\varSigma ^R: \varSigma ^F \rightarrow (\varSigma ^S)^+\). The meaning of \(\varSigma ^R(f) = (s_1,\ldots ,s_k)\) is that f is a function which maps arguments of the sorts \(s_1,\ldots ,s_{k-1}\) to a result of sort \(s_k\). We write \(f: s_1\ \ldots \ s_k\) instead of “\(f \in \varSigma ^F\) and \(\varSigma ^R(f) = (s_1,\ldots ,s_k)\)” if \(\varSigma \) is clear from the context. We always allow \(\varSigma \) to be extended implicitly with arbitrarily many constant function symbols (i.e., function symbols x where \(|\varSigma ^R(x)| = 1\)). Note that SMT-LIB logic only considers closed terms, i.e., terms without free variables, and we are only concerned with quantifier-free formulas, so in our setting, all formulas are ground. Therefore, we refer to these constant function symbols as variables to avoid confusion with other, predefined constant function symbols like \({\textbf {true}} , 0, \ldots \), see below.

Every SMT-LIB signature is an extension of \(\varSigma _{\textbf {Bool}}\) where \(\varSigma ^S_{\textbf {Bool}}= \{{\textbf {Bool}}\}\) and \(\varSigma ^F_{\textbf {Bool}}\) consists of the following function symbols:

$$ {\textbf {true}} , {\textbf {false}} : {\textbf {Bool}}\qquad \lnot : {\textbf {Bool}}\ {\textbf {Bool}}\qquad \wedge , \vee , \implies , \iff : {\textbf {Bool}}\ {\textbf {Bool}}\ {\textbf {Bool}}$$

Note that SMT-LIB logic only considers well-sorted terms. A \(\varSigma \)-structure \(\textbf{A}\) consists of a universe \(A = \bigcup _{s \in \varSigma ^S} A_s\) and an interpretation function that maps each function symbol \(f: s_1\ \ldots \ s_k\) to a function from \(A_{s_1} \times \ldots \times A_{s_{k-1}}\) to \(A_{s_k}\). SMT-LIB logic only considers structures where \(A_{\textbf {Bool}}= \{{\textbf {true}} ,{\textbf {false}} \}\) and all function symbols from \(\varSigma _{\textbf {Bool}}\) are interpreted as usual.

A \(\varSigma \)-theory is a class of \(\varSigma \)-structures. For example, consider the extension \(\varSigma _{\textbf {Int}}\) of \(\varSigma _{\textbf {Bool}}\) with the additional sort \({\textbf {Int}}\) and the following function symbols:

$$\begin{aligned} 0,1, \ldots : {\textbf {Int}}\qquad +,-,\cdot ,\mathrel {\textsf {div}},\mathrel {\textsf {mod}}: {\textbf {Int}}\ {\textbf {Int}}\ {\textbf {Int}}\qquad <, \le , >, \ge , =, \ne : {\textbf {Int}}\ {\textbf {Int}}\ {\textbf {Bool}}\end{aligned}$$

Then the \(\varSigma _{\textbf {Int}}\)-theory non-linear integer arithmetic (\(\text {NIA}\), Footnote 2) contains all \(\varSigma _{\textbf {Int}}\)-structures where \(A_{\textbf {Int}}= \mathbb {Z}\) and all symbols from \(\varSigma _{\textbf {Int}}\) are interpreted as usual.

If \(\textbf{A}\) is a \(\varSigma \)-structure and \(\varSigma '\) is a subsignature of \(\varSigma \), then the reduct of \(\textbf{A}\) to \(\varSigma '\) is the unique \(\varSigma '\)-structure that interprets its function symbols like \(\textbf{A}\). So the theory linear integer arithmetic (\(\text {LIA}\)) consists of the reducts of all elements of \(\text {NIA}\) to \(\varSigma _{\textbf {Int}}\setminus \{\cdot ,\mathrel {\textsf {div}},\mathrel {\textsf {mod}}\}\).

Given a \(\varSigma \)-structure \(\textbf{A}\) and a \(\varSigma \)-term t, the meaning of t results from interpreting all function symbols according to \(\textbf{A}\). For function symbols f whose interpretation is fixed by a \(\varSigma \)-theory \(\mathcal {T}\), we denote f’s interpretation by \(f^{\mathcal {T}}\). Given a \(\varSigma \)-theory \(\mathcal {T}\), a \(\varSigma \)-formula \(\varphi \) (i.e., a \(\varSigma \)-term of type \({\textbf {Bool}}\)) is satisfiable in \(\mathcal {T}\) if there is an \(\textbf{A} \in \mathcal {T}\) such that the meaning of \(\varphi \) w.r.t. \(\textbf{A}\) is \({\textbf {true}} \). Then \(\textbf{A}\) is called a model of \(\varphi \), written \(\textbf{A}\,\models \,\varphi \). If every \(\textbf{A} \in \mathcal {T}\) is a model of \(\varphi \), then \(\varphi \) is \(\mathcal {T}\)-valid, written \(\models _\mathcal {T}\varphi \). We write \(\psi \equiv _\mathcal {T}\varphi \) for \(\models _\mathcal {T}\psi \iff \varphi \).

We sometimes also consider uninterpreted functions. Then the signature may not only contain the function symbols of the theory under consideration and variables, but also additional non-constant function symbols.

We write “term”, “structure”, “theory”, \(\ldots \) instead of “\(\varSigma \)-term”, “\(\varSigma \)-structure”, “\(\varSigma \)-theory”, \(\ldots \) if \(\varSigma \) is irrelevant or clear from the context. Similarly, we just write “\(\equiv \)” and “valid” instead of “\(\equiv _\mathcal {T}\)” and “\(\mathcal {T}\)-valid” if \(\mathcal {T}\) is clear from the context. Moreover, we use unary minus and \(t^c\) (where t is a term of sort \({\textbf {Int}}\) and \(c \in \mathbb {N}\)) as syntactic sugar, and we use infix notation for binary function symbols.

In the sequel, we use \(x,y,z,\ldots \) for variables, \(s,t,p,q,\ldots \) for terms of sort \({\textbf {Int}}\), \(\varphi , \psi ,\ldots \) for formulas, and \(a,b,c,d, \ldots \) for integers.

3 The SMT Theory \(\text {EIA}\)

We now introduce our novel SMT theory for exponential integer arithmetic. To this end, we define the signature \(\varSigma _{\textbf {Int}}^{\textsf {exp}}\), which extends \(\varSigma _{\textbf {Int}}\) with

$$ \textsf {exp}: {\textbf {Int}}\ {\textbf {Int}}\ {\textbf {Int}}. $$

If the \(2^{nd}\) argument of \(\textsf {exp}\) is non-negative, then its semantics is as expected, i.e., we are interested in structures \(\textbf{A}\) that interpret \(\textsf {exp}(c,d)\) as \(c^d\) for all \(d \ge 0\). However, if the \(2^{nd}\) argument is negative, then we have to use a different semantics. The reason is that we may have \(c^d \notin \mathbb {Z}\) if \(d < 0\). Intuitively, \(\textsf {exp}\) should be a partial function, but all functions are total in SMT-LIB logic. We solve this problem by interpreting \(\textsf {exp}(c,d)\) as \(c^{|d|}\). This semantics has previously been used in the literature, and the resulting logic admits a known decidable fragment [5].

Definition 1

(EIA). The theory exponential integer arithmetic (EIA) contains all \(\varSigma _{\textbf {Int}}^{\textsf {exp}}\)-structures \(\textbf{A}\) that interpret \(\textsf {exp}(c,d)\) as \(c^{|d|}\) for all \(c,d \in \mathbb {Z}\) and whose reduct to \(\varSigma _{\textbf {Int}}\) is in \(\text {NIA} \).

Alternatively, one could treat \(\textsf {exp}(c,d)\) like an uninterpreted function if d is negative. Doing so would be analogous to the treatment of division by zero in SMT-LIB logic. Then, e.g., \(\textsf {exp}(0,-1) \ne \textsf {exp}(0,-2)\) would be satisfied by a structure \(\textbf{A}\) that interprets \(\textsf {exp}(c,d)\) as \(c^d\) if \(d \ge 0\), but assigns two different values to \(\textsf {exp}(0,-1)\) and \(\textsf {exp}(0,-2)\). However, the drawback of this approach is that important laws of exponentiation like

$$ \textsf {exp}(\textsf {exp}(x,y), z) = \textsf {exp}(x,y \cdot z) $$

would not be valid. Thus, we focus on the semantics from Definition 1.
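The chosen total semantics \(\textsf {exp}(c,d) = c^{|d|}\) and the law above can be checked numerically. The following sketch (plain Python, exhaustive over a small grid) confirms that \(\textsf {exp}(\textsf {exp}(x,y),z) = \textsf {exp}(x,y \cdot z)\) also holds for negative arguments:

```python
def eia_exp(c, d):
    # total semantics of exp from Definition 1: exp(c, d) = c^{|d|}
    return c ** abs(d)

# the law exp(exp(x,y),z) = exp(x, y*z) holds on the whole grid,
# including negative bases and exponents
for x in range(-4, 5):
    for y in range(-4, 5):
        for z in range(-4, 5):
            assert eia_exp(eia_exp(x, y), z) == eia_exp(x, y * z)
```

The signs work out because \((x^{|y|})^{|z|}\) and \(x^{|y \cdot z|}\) are negative in exactly the same cases (x negative and both \(|y|\) and \(|z|\) odd).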

4 Solving \(\text {EIA}\) Problems via CEGAR

[Algorithm 1: Solving \(\text {EIA}\) problems via CEGAR]

We now explain our technique for solving \(\text {EIA}\) problems, see Algorithm 1. Our goal is to (dis)prove satisfiability of \(\varphi \) in \(\text {EIA} \). The loop in Line 6 is a CEGAR loop which lifts an SMT solver for \(\text {NIA}\) (which is called in Line 6) to \(\text {EIA}\). So the abstraction consists of using \(\text {NIA}\)- instead of \(\text {EIA}\)-models. Hence, \(\textsf {exp}\) is considered to be an uninterpreted function in Line 6, i.e., the SMT solver also searches for an interpretation of \(\textsf {exp}\). If the model found by the SMT solver is a counterexample (i.e., if its interpretation of \(\textsf {exp}\) conflicts with the actual semantics of exponentiation), then the formula under consideration is refined by adding suitable lemmas in Lines 9–11 and the loop is iterated again.

Definition 2

(Counterexample). We call a \(\text {NIA}\)-model \(\textbf{A}\) of \(\varphi \) a counterexample if there is a subterm \(\textsf {exp}(s,t)\) of \(\varphi \) such that the value of \(\textsf {exp}(s,t)\) in \(\textbf{A}\) differs from \(c^{|d|}\), where c and d are the values of s and t in \(\textbf{A}\).

In the sequel, we first discuss our preprocessings (first loop in Algorithm 1) in Sect. 4.1. Then we explain our refinement (Lines 9–11) in Sect. 4.2. Here, we first introduce the different kinds of lemmas that are used by our implementation in Sects. 4.2.1–4.2.4. If implemented naively, the number of lemmas can get quite large, so we explain how to generate lemmas lazily in Sect. 4.2.5. Finally, we conclude this section by stating important properties of Algorithm 1.
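Definition 2 can be phrased operationally: given a model that also assigns values to all \(\textsf {exp}\)-subterms (the solver treats \(\textsf {exp}\) as uninterpreted), check whether some \(\textsf {exp}(s,t)\) gets a value different from \(c^{|d|}\). A minimal sketch, where the term representation and model encoding are hypothetical and not taken from the paper or the tool:

```python
def eval_term(t, model):
    """Evaluate a term over Z. Terms are ints, variable names, or
    tuples ('+'|'*'|'exp', lhs, rhs); 'exp' is looked up in the
    model, since the NIA solver treats it as uninterpreted."""
    if isinstance(t, int):
        return t
    if isinstance(t, str):
        return model[t]
    op, a, b = t
    va, vb = eval_term(a, model), eval_term(b, model)
    if op == '+':
        return va + vb
    if op == '*':
        return va * vb
    return model['exp'][(va, vb)]        # uninterpreted exp

def counterexamples(exp_subterms, model):
    """Return the exp-subterms whose value in the model conflicts
    with the EIA semantics exp(s,t) = s^{|t|} (cf. Definition 2)."""
    bad = []
    for (s, t) in exp_subterms:
        vs, vt = eval_term(s, model), eval_term(t, model)
        if model['exp'][(vs, vt)] != vs ** abs(vt):
            bad.append((s, t))
    return bad

# a model where exp(x, y) = exp(3, 2) is mis-interpreted as 7
model = {'x': 3, 'y': 2, 'exp': {(3, 2): 7}}
assert counterexamples([('x', 'y')], model) == [('x', 'y')]
model['exp'][(3, 2)] = 9                 # now it agrees with 3^2
assert counterexamples([('x', 'y')], model) == []
```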

Example 3

(Leading Example). To illustrate our approach, we show how to prove

$$ \forall x,y.\ |x| > 2 \wedge |y| > 2 \implies \textsf {exp}(\textsf {exp}(x,y),y) \ne \textsf {exp}(x,\textsf {exp}(y,y)) $$

by encoding absolute values suitably (Footnote 3) and proving unsatisfiability of its negation:

$$ x^2 > 4 \wedge y^2 > 4 \wedge \textsf {exp}(\textsf {exp}(x,y),y) = \textsf {exp}(x,\textsf {exp}(y,y)) $$

4.1 Preprocessings

In the first loop of Algorithm 1, we preprocess \(\varphi \) by alternating constant folding (Line 3) and rewriting (Line 4) until a fixpoint is reached. Constant folding evaluates subexpressions without variables, where subexpressions \(\textsf {exp}(c,d)\) are evaluated to \(c^{|d|}\), i.e., according to the semantics of \(\text {EIA}\). Rewriting reduces the number of occurrences of \(\textsf {exp}\) via the following (terminating) rewrite rules:

$$\begin{aligned} \textsf {exp}(x,c) & {} \rightarrow x^{|c|} \qquad \qquad \qquad \text {if } c \in \mathbb {Z}\\ \textsf {exp}(\textsf {exp}(x,y),z) & {} \rightarrow \textsf {exp}(x, y \cdot z) \\ \textsf {exp}(x,y) \cdot \textsf {exp}(z,y) & {} \rightarrow \textsf {exp}(x \cdot z, y) \end{aligned}$$

In particular, the \(1^{st}\) rule allows us to rewrite (Footnote 4) \(\textsf {exp}(s,0)\) to \(s^0 = 1\) and \(\textsf {exp}(s,1)\) to \(s^1 = s\). Note that the rule

$$ \textsf {exp}(x,y) \cdot \textsf {exp}(x,z) \rightarrow \textsf {exp}(x,y+z) $$

would be unsound, as the right-hand side would need to be \(\textsf {exp}(x,|y|+|z|)\) instead. We leave the question whether such a rule is beneficial to future work.
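Both the soundness of the \(3^{rd}\) rewrite rule and the unsoundness of the discarded rule are easy to confirm numerically under the semantics \(\textsf {exp}(c,d) = c^{|d|}\) (a plain Python sketch):

```python
def eia_exp(c, d):
    # EIA semantics: exp(c, d) = c^{|d|}
    return c ** abs(d)

# rule 3 is sound: exp(x,y) * exp(z,y) -> exp(x*z, y)
for x in range(-4, 5):
    for y in range(-4, 5):
        for z in range(-4, 5):
            assert eia_exp(x, y) * eia_exp(z, y) == eia_exp(x * z, y)

# the discarded rule exp(x,y) * exp(x,z) -> exp(x,y+z) is unsound:
assert eia_exp(2, 1) * eia_exp(2, -1) == 4   # 2^{|1|} * 2^{|-1|} = 4
assert eia_exp(2, 1 + (-1)) == 1             # but 2^{|0|} = 1
assert eia_exp(2, abs(1) + abs(-1)) == 4     # exp(x, |y|+|z|) would be correct
```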

Example 4

(Preprocessing). For our leading example, applying the \(2^{nd}\) rewrite rule at the underlined position yields:

$$\begin{aligned} & x^2 > 4 \wedge y^2 > 4 \wedge \underline{\textsf {exp}(\textsf {exp}(x,y),y)} = \textsf {exp}(x,\textsf {exp}(y,y)) \nonumber \\ {} \rightarrow {} & x^2 > 4 \wedge y^2 > 4 \wedge \textsf {exp}(x,y^2) = \textsf {exp}(x,\textsf {exp}(y,y)) \end{aligned}$$
(2)

Lemma 5

We have \(\varphi \equiv _{ {\text {EIA}}} \textsc {FoldConstants}(\varphi )\) and \(\varphi \equiv _{ {\text {EIA}}} \textsc {Rewrite}(\varphi )\).

4.2 Refinement

Our refinement (Lines 9–11 of Algorithm 1) is based on the four kinds of lemmas named in Line 9: symmetry lemmas, monotonicity lemmas, bounding lemmas, and interpolation lemmas. In the sequel, we explain how we compute a set \(\mathcal {L}\) of such lemmas. Then our refinement conjoins

$$ \{\psi \in \mathcal {L}\mid \textbf{A}\,\not \models \,\psi \} $$

to \(\varphi \) in Line 11. As our lemmas allow us to eliminate any counterexample, this set is never empty, see Theorem 23. To compute \(\mathcal {L}\), we consider all terms that are relevant for the formula \(\varphi \).

Definition 6

(Relevant Terms). A term \( {\textsf {exp}}(s,t)\) is relevant if \(\varphi \) has a subterm of the form \( {\textsf {exp}}(\pm s,\pm t)\).

Example 7

(Relevant Terms). For our leading example (2), the relevant terms are all terms of the form \(\textsf {exp}(\pm x, \pm y^2)\), \(\textsf {exp}(\pm y, \pm y)\), or \(\textsf {exp}(\pm x, \pm \textsf {exp}(y,y))\).

While the formula \(\varphi \) is changed in Line 11 of Algorithm 1, we only conjoin new lemmas to \(\varphi \), and thus, relevant terms can never become irrelevant. Moreover, by construction our lemmas only contain \(\textsf {exp}\)-terms that were already relevant before. Thus, the set of relevant terms is not changed by our CEGAR loop.

As mentioned in Sect. 1, our approach may also compute lemmas with non-linear polynomial arithmetic. However, our lemmas are linear if s is an integer constant and t is linear for all subterms \(\textsf {exp}(s,t)\) of \(\varphi \). Here, despite the fact that the function “\(\mathrel {\textsf {mod}}\)” is not contained in the signature of \(\text {LIA}\), we also consider literals of the form \(s \mathrel {\textsf {mod}}c = 0\) with \(c \in \mathbb {N}_+ = \mathbb {N}\setminus \{0\}\) as linear. The reason is that, according to the SMT-LIB standard, \(\text {LIA}\) contains a function (Footnote 5) “\(\textsf{divisible}_c: {\textbf {Int}}\ {\textbf {Bool}}\)” for each \(c \in \mathbb {N}_+\), which yields \({\textbf {true}} \) iff its argument is divisible by c. Hence, \(s \mathrel {\textsf {mod}}c = 0\) holds iff \(\textsf{divisible}_c(s)\) does.

In the sequel, \(\llbracket t \rrbracket \) denotes the value of the term t in \(\textbf{A}\), where \(\textbf{A}\) is the model from Line 6 of Algorithm 1.

4.2.1 Symmetry Lemmas

Symmetry lemmas encode the relation between terms of the form \(\textsf {exp}(\pm s,\pm t)\). For each relevant term \(\textsf {exp}(s,t)\), the set \(\mathcal {L}\) contains the following symmetry lemmas:

$$\begin{aligned} \textrm{SYM}_{1}: {} & {} t \mathrel {\textsf {mod}}2 = 0 \implies {} & \textsf {exp}(-s,t) = \textsf {exp}(s,t) \\ \textrm{SYM}_{2}: {} & {} t \mathrel {\textsf {mod}}2 = 1 \implies {} & \textsf {exp}(-s,t) = -\textsf {exp}(s,t) \\ \textrm{SYM}_{3}: {} & {} {} & \textsf {exp}(s,t) = \textsf {exp}(s,-t) \end{aligned}$$

Note that \(\textrm{SYM}_{1}\) and \(\textrm{SYM}_{2}\) are just implications, not equivalences, as, for example, \(c^{|d|} = (-c)^{|d|}\) does not imply \(d \mathrel {\textsf {mod}}2 = 0\) if \(c = 0\).

Example 8

(Symmetry Lemmas). For our leading example (2), the following symmetry lemmas would be considered, among others:

$$\begin{aligned} \textrm{SYM}_{1}: {} & {} -y \mathrel {\textsf {mod}}2 = 0 \implies {} & \textsf {exp}(-y,-y) = \textsf {exp}(y,-y) \end{aligned}$$
(3)
$$\begin{aligned} \textrm{SYM}_{2}: {} & {} -y \mathrel {\textsf {mod}}2 = 1 \implies {} & \textsf {exp}(-y,-y) = -\textsf {exp}(y,-y) \end{aligned}$$
(4)
$$\begin{aligned} \textrm{SYM}_{3}: {} & {} {}& \textsf {exp}(x, \textsf {exp}(y,y)) = \textsf {exp}(x, -\textsf {exp}(y,y)) \end{aligned}$$
(5)
$$\begin{aligned} \textrm{SYM}_{3}: {} & {} {}& \textsf {exp}(y,y) = \textsf {exp}(y,-y) \end{aligned}$$
(6)

Note that, e.g., (3) results from the term \(\textsf {exp}(-y,-y)\), which is relevant (see Definition 6) even though it does not occur in \(\varphi \).
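The three symmetry schemas, as suggested by Example 8 and Lemma 9, can be checked exhaustively on a small grid. Here \(\textsf {exp}\) is the \(\text {EIA}\) semantics, and SMT-LIB’s \(\mathrel {\textsf {mod}}\) (non-negative result) coincides with Python’s `%` for the positive divisor 2 (a sketch):

```python
def eia_exp(c, d):
    return c ** abs(d)   # EIA semantics of exp

for s in range(-6, 7):
    for t in range(-6, 7):
        # SYM1: t mod 2 = 0  ==>  exp(-s, t) = exp(s, t)
        if t % 2 == 0:
            assert eia_exp(-s, t) == eia_exp(s, t)
        # SYM2: t mod 2 = 1  ==>  exp(-s, t) = -exp(s, t)
        if t % 2 == 1:
            assert eia_exp(-s, t) == -eia_exp(s, t)
        # SYM3: exp(s, t) = exp(s, -t)
        assert eia_exp(s, t) == eia_exp(s, -t)
```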

To show soundness of our refinement, we have to show that our lemmas are \(\text {EIA} \)-valid.

Lemma 9

Let s, t be terms of sort \({\textbf {Int}}\). Then \(\textrm{SYM}_{1}\)–\(\textrm{SYM}_{3}\) are \(\text {EIA}\)-valid.

4.2.2 Monotonicity Lemmas

Monotonicity lemmas are of the form

$$\begin{aligned} s_2 \ge s_1 > 1 \wedge t_2 \ge t_1 > 0 \wedge (s_2 > s_1 \vee t_2 > t_1) \implies \textsf {exp}(s_2,t_2) > \textsf {exp}(s_1,t_1), \end{aligned}$$
(mon)

i.e., they prohibit violations of monotonicity of \(\textsf {exp}\).

Example 10

(Monotonicity Lemmas). For our leading example (2), we obtain, e.g., the following lemmas:

$$\begin{aligned} x > 1 \wedge \textsf {exp}(y,y) > y^2 > 0 \implies {} & \textsf {exp}(x, \textsf {exp}(y,y)) > \textsf {exp}(x, y^2) \end{aligned}$$
(7)
$$\begin{aligned} x > 1 \wedge -\textsf {exp}(y,y) > y^2 > 0 \implies {} & \textsf {exp}(x, -\textsf {exp}(y,y)) > \textsf {exp}(x, y^2) \end{aligned}$$
(8)

So for each pair of different relevant terms \(\textsf {exp}(s_1,t_1), \textsf {exp}(s_2,t_2)\) where \(\llbracket s_1 \rrbracket \le \llbracket s_2 \rrbracket \) and \(\llbracket t_1 \rrbracket \le \llbracket t_2 \rrbracket \), the set \(\mathcal {L}\) contains (mon).

Lemma 11

Let \(s_1,s_2,t_1,t_2\) be terms of sort \({\textbf {Int}}\). Then mon is \(\text {EIA}\)-valid.
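Validity of (mon) can likewise be confirmed on a finite grid (a sketch; the ranges are arbitrary):

```python
def eia_exp(c, d):
    return c ** abs(d)   # EIA semantics of exp

for s1 in range(2, 8):
    for s2 in range(2, 8):
        for t1 in range(1, 7):
            for t2 in range(1, 7):
                if s2 >= s1 > 1 and t2 >= t1 > 0 and (s2 > s1 or t2 > t1):
                    # mon: exp(s2, t2) > exp(s1, t1)
                    assert eia_exp(s2, t2) > eia_exp(s1, t1)
```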

4.2.3 Bounding Lemmas

Bounding lemmas provide bounds on relevant terms \(\textsf {exp}(s,t)\) where \(\llbracket s \rrbracket \) and \(\llbracket t \rrbracket \) are non-negative. Together with symmetry lemmas, they also give rise to bounds for the cases where s or t are negative.

For each relevant term \(\textsf {exp}(s,t)\) where \(\llbracket s \rrbracket \) and \(\llbracket t \rrbracket \) are non-negative, the following lemmas are contained in \(\mathcal {L}\):

$$\begin{aligned} \textrm{BND}_{1}: {} & {} t = 0 \implies {} & \textsf {exp}(s,t) = 1 \\ \textrm{BND}_{2}: {} & {} t = 1 \implies {} & \textsf {exp}(s,t) = s \\ \textrm{BND}_{3}: {} & {} s = 0 \wedge t \ne 0 \iff {} & \textsf {exp}(s,t) = 0 \\ \textrm{BND}_{4}: {} & {} s = 1 \implies {} & \textsf {exp}(s,t) = 1 \\ \textrm{BND}_{5}: {} & {} s> 2 \wedge t > 2 \implies {} & \textsf {exp}(s,t) > s \cdot t + 1 \end{aligned}$$

The cases \(t \in \{0,1\}\) are also addressed by our first rewrite rule (see Sect. 4.1). However, this rewrite rule only applies if t is an integer constant. In contrast, the first two lemmas above apply if t evaluates to 0 or 1 in the current model.

Example 12

(Bounding Lemmas). For our leading example (2), the following bounding lemmas would be considered, among others:

$$\begin{aligned} \textrm{BND}_1: &\,& \textsf {exp}(y,y) = 0 \implies {} & \textsf {exp}(x,\textsf {exp}(y,y)) = 1 \nonumber \\ \textrm{BND}_2: &\,& \textsf {exp}(y,y) = 1 \implies {} & \textsf {exp}(x,\textsf {exp}(y,y)) = x \nonumber \\ \textrm{BND}_3: &\,& x = 0 \wedge \textsf {exp}(y,y) \ne 0 \iff {} & \textsf {exp}(x,\textsf {exp}(y,y)) = 0 \nonumber \\ \textrm{BND}_4: &\,& x = 1 \implies {} & \textsf {exp}(x,\textsf {exp}(y,y)) = 1 \nonumber \\ \textrm{BND}_5: &\,& y > 2 \implies {} & \textsf {exp}(y,y) > y^2 + 1 \end{aligned}$$
(9)
$$\begin{aligned} \textrm{BND}_5: &\,& -y > 2 \implies {} & \textsf {exp}(-y,-y) > y^2 + 1 \end{aligned}$$
(10)

Lemma 13

Let s, t be terms of sort \({\textbf {Int}}\). Then \(\textrm{BND}_{1}\)–\(\textrm{BND}_{5}\) are \(\text {EIA}\)-valid.

The bounding lemmas are defined in such a way that they provide lower bounds for \(\textsf {exp}(s,t)\) for almost all non-negative values of s and t. The reason why we focus on lower bounds is that polynomials can only bound \(\textsf {exp}(s,t)\) from above for finitely many values of s and t. The missing (lower and upper) bounds are provided by interpolation lemmas.
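The bounding lemmas can be sanity-checked numerically for non-negative arguments. In the sketch below, \(\textrm{BND}_{1}\)–\(\textrm{BND}_{4}\) follow the instances in Example 12, while the exact guard of \(\textrm{BND}_{5}\) is an assumption read off from instances (9) and (10):

```python
def eia_exp(c, d):
    return c ** abs(d)   # EIA semantics of exp (0^0 = 1)

for s in range(0, 9):
    for t in range(0, 9):
        v = eia_exp(s, t)
        if t == 0:
            assert v == 1                        # BND1
        if t == 1:
            assert v == s                        # BND2
        assert (v == 0) == (s == 0 and t != 0)   # BND3 (an equivalence)
        if s == 1:
            assert v == 1                        # BND4
        if s > 2 and t > 2:
            assert v > s * t + 1                 # BND5 (guard assumed from Ex. 12)
```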

4.2.4 Interpolation Lemmas

In addition to bounding lemmas, we use interpolation lemmas that are constructed via bilinear interpolation to provide bounds. Here, we assume that the arguments of \(\textsf {exp}\) are positive, as negative arguments are handled by symmetry lemmas, and bounding lemmas yield tight bounds if at least one argument of \(\textsf {exp}\) is 0. The correctness of interpolation lemmas relies on the following observation.

Lemma 14

Let \(f: \mathbb {R}_+ \rightarrow \mathbb {R}_+\) be convex, \(w_1,w_2 \in \mathbb {R}_+\), and \(w_1 < w_2\). Then

$$\begin{aligned} \forall x \in [w_1,w_2].&\ f(x) \le f(w_1) + \frac{f(w_2)-f(w_1)}{w_2-w_1} \cdot (x - w_1) & \text {and} \\ \forall x \in \mathbb {R}_+ \setminus (w_1,w_2).&\ f(x) \ge f(w_1) + \frac{f(w_2)-f(w_1)}{w_2-w_1} \cdot (x - w_1). \end{aligned}$$

Here, \([w_1,w_2]\) and \((w_1,w_2)\) denote closed and open real intervals. Note that the right-hand side of the inequations above is the linear interpolant of f between \(w_1\) and \(w_2\). Intuitively, it corresponds to the secant of f between the points \((w_1,f(w_1))\) and \((w_2,f(w_2))\), and thus the lemma follows from convexity of f.
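Lemma 14 is the standard secant property of convex functions. A quick numeric illustration for the convex function \(f(x) = x^3\) on \(\mathbb {R}_+\) (a sketch; the interval and sample points are arbitrary):

```python
def secant(f, w1, w2, x):
    # linear interpolant (secant) of f between w1 and w2
    return f(w1) + (f(w2) - f(w1)) / (w2 - w1) * (x - w1)

f = lambda x: x ** 3          # convex on R+
w1, w2 = 1.0, 4.0
for i in range(1, 40):        # sample points in (0, 9.75]
    x = i / 4
    if w1 <= x <= w2:
        assert f(x) <= secant(f, w1, w2, x) + 1e-9   # secant above f inside
    else:
        assert f(x) >= secant(f, w1, w2, x) - 1e-9   # secant below f outside
```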

Let \(\textsf {exp}(s,t)\) be relevant, \(c := \llbracket s \rrbracket > 0\), \(d := \llbracket t \rrbracket > 0\), and \(\llbracket \textsf {exp}(s,t) \rrbracket \ne c^d\), i.e., we want to prohibit the current interpretation of \(\textsf {exp}(s,t)\).

Interpolation Lemmas for Upper Bounds. First assume \(\llbracket \textsf {exp}(s,t) \rrbracket > c^d\), i.e., to rule out this counterexample, we need a lemma that provides a suitable upper bound for \(\textsf {exp}(c,d)\). Let \(c',d' \in \mathbb {N}_+\) and:

$$\begin{aligned} c^- & {} := \min (c,c') & c^+ & {} := \max (c,c') & d^- & {} := \min (d,d') & d^+ & {} := \max (d,d') \\ {} & {} [c^\pm ] & {} := [c^- .. \ c^+] & [d^\pm ] & {} := [d^- .. \ d^+] \end{aligned}$$

Here, \([a .. \ b]\) denotes a closed integer interval. Then we first use \(d^-,d^+\) for linear interpolation w.r.t. the \(2^{nd}\) argument of \(\lambda x,y.\ x^y\). To this end, let

$$ \textrm{ip}_{2}^{[{d}^\pm ]}(x,y) := x^{d^-} + \frac{x^{d^+} - x^{d^-}}{d^+ - d^-} \cdot (y - d^-) $$

where we define \(\tfrac{a}{b} := a/b\) if \(b \ne 0\) and \(\tfrac{a}{0} := 0\). So if \(d^- < d^+\), then \(\textrm{ip}_{2}^{[{d}^\pm ]}(x,y)\) corresponds to the linear interpolant of \(x^y\) w.r.t. y between \(d^-\) and \(d^+\). Then \(\textrm{ip}_{2}^{[{d}^\pm ]}(x,y)\) is a suitable upper bound, as

$$\begin{aligned} \forall x \in \mathbb {N}_+, y \in [d^\pm ].\ x^y \le \textrm{ip}_{2}^{[{d}^\pm ]}(x,y) \end{aligned}$$
(11)

follows from Lemma 14. Hence, we could derive the following \(\text {EIA}\)-valid lemma (Footnote 6):

$$ \textrm{IP}_{1}: \quad s> 0 \wedge d^- \le t \le d^+ \implies \textsf {exp}(s,t) \le \textrm{ip}_{2}^{[{d}^\pm ]}(s,t) $$

Example 15

(Linear Interpolation w.r.t. y). Let \(\llbracket s \rrbracket = 3\) and \(\llbracket t \rrbracket = 9\), i.e., we have \(c = 3\) and \(d = 9\). Moreover, assume \(c' = d' = 1\), i.e., we get \(c^- = 1\), \(c^+ = 3\), \(d^- = 1\), and \(d^+ = 9\). Then

$$ \textrm{ip}_{2}^{[{d}^\pm ]}(x,y) = \textrm{ip}_{2}^{[{1} .. {9}]}(x,y) = x^{1} + \frac{x^{9} - x^{1}}{9 - 1} \cdot (y - 1) = x + \frac{x^9-x}{8} \cdot (y-1). $$

Hence, \(\textrm{IP}_{1}\) corresponds to

$$\begin{aligned} s > 0 \wedge t \in [1,9] \implies \textsf {exp}(s,t) \le s + \frac{s^9-s}{8} \cdot (t-1). \end{aligned}$$

This lemma would be violated by our counterexample, as we have

$$ \llbracket \textsf {exp}(s,t) \rrbracket> c^d = 3^9 = 19683 = 3 + \frac{3^9-3}{8} \cdot (9-1). $$

However, the degree of \(\textrm{ip}_{2}^{[{d}^\pm ]}(s,t)\) depends on \(d^+\), which in turn depends on the model that was found by the underlying SMT solver. Thus, the degree of \(\textrm{ip}_{2}^{[{d}^\pm ]}(s,t)\) can get very large, which is challenging for the underlying solver.

So we next use \(c^-,c^+\) for linear interpolation w.r.t. the \(1^{st}\) argument of \(\lambda x,y.\ x^y\), resulting in

$$ \textrm{ip}_{1}^{[{c}^\pm ]}(x,y) := (c^-)^y + \frac{(c^+)^y - (c^-)^y}{c^+ - c^-} \cdot (x - c^-) $$

Then due to Lemma 14, \(\textrm{ip}_{1}^{[{c}^\pm ]}(x,y)\) is also an upper bound on the exponentiation function, i.e., we have

$$\begin{aligned} \forall y \in \mathbb {N}_+, x \in [c^\pm ].\ x^y \le \textrm{ip}_{1}^{[{c}^\pm ]}(x,y). \end{aligned}$$
(12)

Note that we have \(0 \le \frac{y - d^-}{d^+ - d^-} \le 1\) for all \(y \in [d^\pm ]\), and thus

$$ \textrm{ip}_{2}^{[{d}^\pm ]}(x,y) = \left( 1 - \frac{y - d^-}{d^+ - d^-}\right) \cdot x^{d^-} + \frac{y - d^-}{d^+ - d^-} \cdot x^{d^+} $$

is monotonically increasing in both \(x^{d^-}\) and \(x^{d^+}\). Hence, in the definition of \(\textrm{ip}_{2}^{[{d}^\pm ]}\), we can approximate \(x^{d^-}\) and \(x^{d^+}\) with their upper bounds \(\textrm{ip}_{1}^{[{c}^\pm ]}(x,d^-)\) and \(\textrm{ip}_{1}^{[{c}^\pm ]}(x,d^+)\) that can be derived from (12). Then (11) yields

$$\begin{aligned} \forall x \in [c^\pm ], y \in [d^\pm ].\ x^y \le \textrm{ip}^{[{c}^\pm ][{d}^\pm ]}(x,y) \end{aligned}$$
(13)

where

$$ \textrm{ip}^{[{c}^\pm ][{d}^\pm ]}(x,y) := \textrm{ip}_{1}^{[{c}^\pm ]}(x,d^-) + \frac{\textrm{ip}_{1}^{[{c}^\pm ]}(x,d^+) - \textrm{ip}_{1}^{[{c}^\pm ]}(x,d^-)}{d^+ - d^-} \cdot (y - d^-) $$

So the set \(\mathcal {L}\) contains the lemma

$$ \textrm{IP}_{2}: \quad c^- \le s \le c^+ \wedge d^- \le t \le d^+ \implies \textsf {exp}(s,t) \le \textrm{ip}^{[{c}^\pm ][{d}^\pm ]}(s,t) $$

which is valid due to (13), and rules out any counterexample with \(\llbracket \textsf {exp}(s,t) \rrbracket > c^d\), as \(\textrm{ip}^{[{c}^\pm ][{d}^\pm ]}(c,d) = c^d\).

Example 16

(Bilinear Interpolation, Example 15 continued). In our example, we have:

$$\begin{aligned} \textrm{ip}_{1}^{[{c}^\pm ]}(x,y) & {} = \textrm{ip}_{1}^{[{1} .. {3}]}(x,y) = 1^y + \frac{3^y - 1^y}{3 - 1} \cdot (x - 1) = 1 + \frac{3^y-1}{2} \cdot (x-1) \\ \textrm{ip}_{1}^{[{c}^\pm ]}(s,d^-) & {} = \textrm{ip}_{1}^{[{1} .. {3}]}(s,1) = 1 + \frac{3-1}{2} \cdot (s-1) = s \\ \textrm{ip}_{1}^{[{c}^\pm ]}(s,d^+) & {} = \textrm{ip}_{1}^{[{1} .. {3}]}(s,9) = 1 + \frac{3^9-1}{2} \cdot (s-1) = 1 + 9841 \cdot (s-1) \end{aligned}$$

Hence, we obtain the lemma

$$ s \in [1,3] \wedge t \in [1,9] \implies \textsf {exp}(s,t) \le s + \frac{1 + 9841 \cdot (s - 1) -s}{8} \cdot (t-1). $$

This lemma is violated by our counterexample, as we have

$$ \llbracket \textsf {exp}(s,t) \rrbracket> 3^9 = 19683 = 3 + \frac{1 + 9841 \cdot (3 - 1) - 3}{8} \cdot (9-1). $$

\(\textrm{IP}_{2}\) relates \(\textsf {exp}(s,t)\) with the bilinear function \(\textrm{ip}^{[{c}^\pm ][{d}^\pm ]}(s,t)\), i.e., this function is linear w.r.t. both s and t, but it multiplies s and t. Thus, if s is an integer constant and t is linear, then the resulting lemma is linear, too.

To compute interpolation lemmas, a second point \((c',d')\) is needed. In our implementation, we store all points (c, d) where interpolation has previously been applied and use the one which is closest to the current one. The same heuristic is used to compute secant lemmas in [11]. For the \(1^{st}\) interpolation step, we use \((c',d') = (c,d)\). In this case, \(\textrm{IP}_{2}\) simplifies to \(s = c \wedge t = d \implies \textsf {exp}(s,t) \le c^d\).
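The computations in Examples 15 and 16 can be reproduced mechanically. The sketch below implements \(\textrm{ip}_{1}\) and the bilinear \(\textrm{ip}\) with exact arithmetic via `Fraction` (the function names are illustrative, not SwInE’s API) and checks the upper-bound property (13) on the box \([1 .. 3] \times [1 .. 9]\):

```python
from fractions import Fraction

def ip1(cm, cp, x, y):
    # linear interpolation of x^y w.r.t. x between cm and cp
    if cp == cm:
        return Fraction(cm) ** y
    return cm ** y + Fraction(cp ** y - cm ** y, cp - cm) * (x - cm)

def ip(cm, cp, dm, dp, x, y):
    # bilinear interpolation: interpolate w.r.t. y, with x^{d-}, x^{d+}
    # replaced by their upper bounds ip1(..., d-) and ip1(..., d+)
    lo, hi = ip1(cm, cp, x, dm), ip1(cm, cp, x, dp)
    if dp == dm:
        return lo
    return lo + Fraction(hi - lo, dp - dm) * (y - dm)

# Example 16: c in [1..3], d in [1..9]
assert ip1(1, 3, Fraction(3), 9) == 1 + 9841 * (3 - 1)   # = 3^9 = 19683
# (13): x^y <= ip(x, y) on the whole box [1..3] x [1..9]
for x in range(1, 4):
    for y in range(1, 10):
        assert x ** y <= ip(1, 3, 1, 9, Fraction(x), y)
# at the interpolation point (3, 9) the bound is exact
assert ip(1, 3, 1, 9, Fraction(3), 9) == 3 ** 9
```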

Lemma 17

Let \(c^+ \ge c^- > 0\) and \(d^+ \ge d^- > 0\). Then \(\textrm{IP}_{2}\) is \(\text {EIA}\)-valid.

Interpolation Lemmas for Lower Bounds. While bounding lemmas already yield lower bounds, the bounds provided by \(\textrm{BND}_{5}\) are not exact, in general. Hence, if \(\llbracket \textsf {exp}(s,t) \rrbracket < c^d\), then we also use bilinear interpolation to obtain a precise lower bound for \(\textsf {exp}(c,d)\). Dually to (11) and (12), Lemma 14 implies:

$$\begin{aligned} \forall x \in \mathbb {N}_+, y \in \mathbb {N}_+.\ x^y \ge \textrm{ip}_{2}^{[{d} .. {d+1}]}(x,y) \end{aligned}$$
(14)
$$\begin{aligned} \forall x \in \mathbb {N}_+, y \in \mathbb {N}_+.\ x^y \ge \textrm{ip}_{1}^{[{c} .. {c+1}]}(x,y) \end{aligned}$$
(15)

Additionally, we also obtain

$$\begin{aligned} \forall x,y \in \mathbb {N}_+.\ x^{y+1} - x^y \ge \textrm{ip}_{1}^{[{c} .. {c+1}]}(x,y+1)-\textrm{ip}_{1}^{[{c} .. {c+1}]}(x,y) \end{aligned}$$
(16)

from Lemma 14. The reason is that for \(f(x) := x^{y+1} - x^y\), the right-hand side of (16) is equal to the linear interpolant of f between c and \(c+1\). Moreover, f is convex, as \(f(x) = x^y \cdot (x-1)\) where for any fixed \(y \in \mathbb {N}_+\), both \(x^y\) and \(x-1\) are non-negative, monotonically increasing, and convex on \(\mathbb {R}_+\).

If \(y \ge d\), then \(\textrm{ip}_{2}^{[{d} .. {d+1}]}(x,y) = x^{d} + (x^{d+1} - x^{d}) \cdot (y - d)\) is monotonically increasing in the first occurrence of \(x^{d}\), and in \(x^{d+1} - x^{d}\). Thus, by approximating \(x^{d}\) and \(x^{d+1} - x^{d}\) with their lower bounds from (15) and (16), (14) yields

$$\begin{aligned} \nonumber \forall x \in \mathbb {N}_+, y \ge d.\ x^y & \ge \textrm{ip}_{1}^{[{c} .. {c+1}]}(x,d) + (\textrm{ip}_{1}^{[{c} .. {c+1}]}(x,d+1) - \textrm{ip}_{1}^{[{c} .. {c+1}]}(x,d)) \cdot (y-d)\\ &= \textrm{ip}^{[{c} .. {c+1}][{d} .. {d+1}]}(x,y). \end{aligned}$$
(17)

So dually to \(\textrm{IP}_{2}\), the set \(\mathcal {L}\) contains the lemma

$$ \textrm{IP}_{3}: \quad s \ge 1 \wedge t \ge d \implies \textsf {exp}(s,t) \ge \textrm{ip}^{[{c} .. {c+1}][{d} .. {d+1}]}(s,t) $$

which is valid due to (17) and rules out any counterexample with \(\llbracket \textsf {exp}(s,t) \rrbracket < c^d\), as \(\textrm{ip}^{[{c} .. {c+1}][{d} .. {d+1}]}(c,d) = c^d\).

Example 18

(Interpolation, Lower Bounds). Let \(\llbracket s \rrbracket = 3\) and \(\llbracket t \rrbracket = 9\), i.e., we have \(c = 3\) and \(d = 9\). Then

$$\begin{aligned} \textrm{ip}_{1}^{[{3} .. {4}]}(x,9) & {} = 3^9 + (4^9-3^9) \cdot (x - 3) {} = 19683 + 242461 \cdot (x-3) \\ \textrm{ip}_{1}^{[{3} .. {4}]}(x,10) & {} = 3^{10} + (4^{10}-3^{10}) \cdot (x - 3) {} = 59049 + 989527 \cdot (x - 3)\\ \textrm{ip}^{[{3} .. {4}][{9} .. {10}]}(x,y) & {} = \textrm{ip}_{1}^{[{3} .. {4}]}(x,9) + ( \textrm{ip}_{1}^{[{3} .. {4}]}(x,10) - \textrm{ip}_{1}^{[{3} .. {4}]}(x,9)) \cdot (y-9) \end{aligned}$$

and thus we obtain the lemma

$$ s \ge 1 \wedge t \ge 9 \implies \textsf {exp}(s,t) \ge 747066 \cdot s \cdot t - 6481133 \cdot s- 2201832 \cdot t + 19108788. $$

It is violated by our counterexample, as we have

$$ \llbracket \textsf {exp}(s,t) \rrbracket < 3^9 = 19683 = 747066 \cdot 3 \cdot 9 - 6481133 \cdot 3 - 2201832 \cdot 9 + 19108788. $$

Lemma 19

Let \(c,d \in \mathbb {N}_+\). Then \(\textrm{IP}_{3}\) is \(\text {EIA}\)-valid.
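Example 18 can be replayed the same way. The sketch below (function names are illustrative) recomputes the coefficients, checks that they match the expanded polynomial from Example 18, and verifies the lower bound of \(\textrm{IP}_{3}\) on a small grid with \(s \ge 1\) and \(t \ge d\):

```python
def ip1(c, x, y):
    # linear interpolation of x^y w.r.t. x between c and c+1
    return c ** y + ((c + 1) ** y - c ** y) * (x - c)

def ip(c, d, x, y):
    # bilinear lower bound ip^{[c..c+1][d..d+1]}(x, y)
    return ip1(c, x, d) + (ip1(c, x, d + 1) - ip1(c, x, d)) * (y - d)

c, d = 3, 9
assert ip(c, d, 3, 9) == 3 ** 9                          # exact at (c, d)
for x in range(1, 7):
    for y in range(9, 14):
        # expanded form from Example 18
        expanded = 747066 * x * y - 6481133 * x - 2201832 * y + 19108788
        assert ip(c, d, x, y) == expanded                # same polynomial
        assert x ** y >= ip(c, d, x, y)                  # IP3 lower bound
```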

4.2.5 Lazy Lemma Generation

In practice, it is not necessary to compute the entire set of lemmas \(\mathcal {L}\). Instead, we can stop as soon as \(\mathcal {L}\) contains a single lemma which is violated by the current counterexample. However, such a strategy would result in a quite fragile implementation, as its behavior would heavily depend on the order in which lemmas are computed, which in turn depends on low-level details like the order of iteration over sets, etc. So instead, we improve Lines 9–11 of Algorithm 1 and use the following precedence on our four kinds of lemmas:

$$ \text {symmetry} \succ \text {monotonicity} \succ \text {bounding} \succ \text {interpolation} $$

Then we compute all lemmas of the same kind, starting with symmetry lemmas, and we only proceed with the next kind if none of the lemmas computed so far is violated by the current counterexample. The motivation for this order is as follows: Symmetry lemmas obtain the highest precedence, as the other kinds of lemmas depend on them for restricting \(\textsf {exp}(s,t)\) in the case that \(s\) or \(t\) is negative. As the coefficients in interpolation lemmas for \(\textsf {exp}(s,t)\) grow exponentially w.r.t. \(c\) and \(d\) (see, e.g., Example 18), interpolation lemmas get the lowest precedence. Finally, we prefer monotonicity lemmas over bounding lemmas, as monotonicity lemmas are linear (if the arguments of \(\textsf {exp}\) are linear), whereas \(\textrm{BND}_{5}\) may be non-linear.
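This refinement loop can be sketched in a few lines of Python. The generator functions below are hypothetical stand-ins for the four lemma constructors described above, not SwInE's actual API:

```python
# Lazy lemma generation: compute lemmas kind by kind, following the
# precedence symmetry > monotonicity > bounding > interpolation, and
# only fall through to the next kind if no lemma computed so far is
# violated by the current counterexample.

def lazy_lemmas(counterexample, generators, violates):
    """generators: lemma constructors, ordered by decreasing precedence;
    violates(cex, lemma): does the counterexample violate the lemma?"""
    lemmas = []
    for generate in generators:
        lemmas += generate(counterexample)
        if any(violates(counterexample, lem) for lem in lemmas):
            return lemmas  # progress: some lemma rules out the counterexample
    return lemmas  # unreachable by the Progress Theorem (Thm. 23)

# Toy instance: a "counterexample" that assigns exp(2, 3) a wrong value.
cex = {("exp", 2, 3): 5}
gens = [
    lambda c: [],                        # symmetry (none apply here)
    lambda c: [],                        # monotonicity
    lambda c: [("exp", 2, 3, "==", 8)],  # bounding-style lemma exp(2,3) = 8
    lambda c: [],                        # interpolation (never reached)
]
viol = lambda c, lem: c[(lem[0], lem[1], lem[2])] != lem[4]
print(lazy_lemmas(cex, gens, viol))  # the bounding lemma already suffices
```

Here the search stops at the bounding lemmas, so the (potentially expensive) interpolation lemmas are never generated.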

Example 20

(Leading Example Finished). We now finish our leading example which, after preprocessing, looks as follows (see Example 4):

figure av

Then our implementation generates 12 symmetry lemmas, 4 monotonicity lemmas, and 8 bounding lemmas before proving unsatisfiability, including (3), (4), (5), (6), (7), (8), (9), and (10).

These lemmas suffice to prove unsatisfiability for the case \(x > 2\) (the cases \(x \in [-2 .. \ 2]\) or \(y \in [-2 .. \ 2]\) are trivial). For example, if \(y < -2\) and \(-y \mathrel {\textsf {mod}}2 = 0\), we get

$$\begin{aligned} y < -2 & \overset{(10)}{\curvearrowright }\ \textsf {exp}(-y,-y) > y^2 + 1 \overset{(3)}{\curvearrowright }\ \textsf {exp}(y,-y) > y^2 + 1 \\ & \overset{(6)}{\curvearrowright }\ \textsf {exp}(y,y) > y^2 + 1 \overset{(7)}{\curvearrowright }\ \textsf {exp}(x, \textsf {exp}(y,y)) > \textsf {exp}(x, y^2) \overset{(2)}{\curvearrowright }\ {\textbf {false}} \end{aligned}$$

and for the cases \(y>2\) and \(y < -2 \wedge -y \mathrel {\textsf {mod}}2 = 1\), unsatisfiability can be shown similarly. For the case \(x < -2\), 5 more symmetry lemmas, 2 more monotonicity lemmas, and 3 more bounding lemmas are used. The remaining 3 symmetry lemmas and 3 bounding lemmas are not used in the final proof of unsatisfiability.

While our leading example can be solved without interpolation lemmas, in general, interpolation lemmas are a crucial ingredient of our approach.

Example 21

Consider the formula

$$ 1 < x < y \wedge 0 < z \wedge \textsf {exp}(x,z) < \textsf {exp}(y,z). $$

Our implementation first rules out 33 counterexamples using 7 bounding lemmas and 42 interpolation lemmas in \(\sim \! 0.1\) seconds, before finding a model for \(x\), \(y\), and \(z\). Recall that interpolation lemmas are only used if a counterexample cannot be ruled out by any other kind of lemma. So without interpolation lemmas, our implementation could not solve this example.
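The formula is indeed satisfiable; a quick Python check exhibits a witness (our own choice, not necessarily the model found by SwInE):

```python
# The formula 1 < x < y /\ 0 < z /\ exp(x,z) < exp(y,z) is satisfiable:
# search a small grid for a witness (exp(a,b) is a^b here, as b > 0).
witness = next(
    (x, y, z)
    for x in range(2, 10) for y in range(2, 10) for z in range(1, 10)
    if 1 < x < y and 0 < z and x**z < y**z
)
print("model (x, y, z):", witness)  # → (2, 3, 1)
```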

Our main soundness theorem follows from soundness of our preprocessings (Lemma 5) and the fact that all of our lemmas are \(\text {EIA}\)-valid (Lemmas 9, 11, 13, 17, and 19).

Theorem 22

(Soundness of Algorithm 1). If Algorithm 1 returns \({\textbf {sat}} \), then \(\varphi \) is satisfiable in \(\text {EIA} \). If Algorithm 1 returns \({\textbf {unsat}} \), then \(\varphi \) is unsatisfiable in \(\text {EIA} \).

Another important property of Algorithm 1 is that it can eliminate any counterexample, and hence it makes progress in every iteration.

Theorem 23

(Progress Theorem). If \(\textbf{A}\) is a counterexample and \(\mathcal {L}\) is computed as in Algorithm 1, then

$$ \textbf{A}\,\not \models \,\bigwedge \mathcal {L}. $$

Despite Theorems 22 and 23, \(\text {EIA}\) is of course undecidable, and hence Algorithm 1 is incomplete. For example, it does not terminate for the input formula

$$\begin{aligned} y \ne 0 \wedge \textsf {exp}(2,x) = \textsf {exp}(3,y). \end{aligned}$$
(18)

Here, to prove unsatisfiability, one needs to know that \(2^{|x|}\) is 1 or even, but \(3^{|y|}\) is odd and greater than 1 (unless \(y=0\)). This cannot be derived from the lemmas used by our approach. Thus, Algorithm 1 would refine the formula (18) infinitely often.
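The parity argument can at least be checked empirically. The following Python sketch (an illustration only, not a proof, and not part of Algorithm 1) confirms on a finite range that \(2^{|x|}\) and \(3^{|y|}\) never coincide when \(y \ne 0\):

```python
# Empirical check of the parity argument for formula (18):
# 2^|x| is 1 (if x = 0) or even, whereas 3^|y| is odd and > 1 for y != 0,
# so exp(2, x) = exp(3, y) is impossible whenever y != 0.
for x in range(-50, 51):
    for y in range(-50, 51):
        if y != 0:
            # parity witnesses the conflict:
            assert 3**abs(y) % 2 == 1 and 3**abs(y) > 1
            assert 2**abs(x) != 3**abs(y)
print("no collision of 2^|x| and 3^|y| with y != 0 on the sample range")
```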

Note that monotonicity lemmas are important, even though they are not required to prove Theorem 23. The reason is that all (usually infinitely many) counterexamples must be eliminated to prove \({\textbf {unsat}} \). For instance, reconsider Example 20, where the monotonicity lemma (7) eliminates infinitely many counterexamples at once. In contrast, Theorem 23 only guarantees that every single counterexample can be eliminated. Consequently, our implementation does not terminate on our leading example if monotonicity lemmas are disabled.

5 Related Work

The most closely related work applies incremental linearization to \(\text {NIA}\), or to non-linear real arithmetic with transcendental functions (\(\text {NRAT}\)). Like our approach, incremental linearization is an instance of the CEGAR paradigm: An initial abstraction (where certain predefined functions are considered as uninterpreted functions) is refined via linear lemmas that rule out the current counterexample.

Our approach is inspired by, but differs significantly from, the approach for the linearization of \(\text {NRAT}\) from [11]. There, non-linear polynomials are linearized as well, whereas we leave the handling of polynomials to the backend solver. Moreover, [11] uses linear lemmas only, whereas we also use bilinear lemmas. Furthermore, [11] fixes the base to Euler's number \(e\), whereas we consider a binary version of exponentiation.

The only lemmas that carry over easily from [11] are monotonicity lemmas. While [11] also uses symmetry lemmas, they express properties of the sine function, i.e., they are fundamentally different from ours. Our bounding lemmas are related to the “lower bound” and “zero” lemmas from [11], but there, \(\lambda x.\ e^x\) is trivially bounded from below by 0. Interpolation lemmas are related to the “tangent” and “secant” lemmas from [11]. However, tangent lemmas make use of first derivatives, so they are not expressible in integer arithmetic in our setting, as we have \(\frac{\partial }{\partial y} x^y = x^y \cdot \ln x\). Secant lemmas are essentially obtained by linear interpolation, so our interpolation lemmas can be seen as a generalization of secant lemmas to binary functions. Preprocessing by rewriting is not considered in [11].

In [10], incremental linearization is applied to \(\text {NIA}\). The lemmas that are used in [10] are similar to those from [11], so they differ fundamentally from ours, too.

Further existing approaches for \(\text {NRAT}\) are based on interval propagation [14, 24]. As observed in [11], interval propagation effectively computes a piecewise constant approximation, which is less expressive than our bilinear approximations.

Recently, a novel approach for \(\text {NRAT}\) based on the topological degree test has been proposed [12, 30]. Its strength is finding irrational solutions more often than other approaches for \(\text {NRAT}\). Hence, this line of work is orthogonal to ours.

\(\text {EIA}\) could also be tackled by combining \(\text {NRAT}\) techniques with branch-and-bound, but the following example shows that doing so is not promising.

Example 24

Consider the formula \(x = \textsf {exp}(3, y) \wedge y > 0\). To tackle it with existing solvers, we have to encode it using the natural exponential function:

$$\begin{aligned} e^z = 3 \wedge x = e^{y \cdot z} \wedge y > 0 \end{aligned}$$
(19)

Here \(x\) and \(y\) range over the integers and \(z\) ranges over the reals. Any model of (19) satisfies \(z = \ln 3\), where \(\ln 3\) is irrational. As finding such models is challenging, the leading tools MathSAT [9] and CVC5 [2] fail for \(e^z = 3\).

MetiTarski [1] integrates decision procedures for real closed fields and approximations for transcendental functions into the theorem prover Metis [27] to prove theorems about the reals. In a related line of work, iSAT3 [14] has been coupled with SPASS [35]. Clearly, these approaches differ fundamentally from ours.

Recently, the complexity of a decidable extension of linear integer arithmetic with exponentiation has been investigated [5]. It is equivalent to \(\text {EIA}\) without the functions “\(\cdot \)”, “\(\mathrel {\textsf {div}}\)”, and “\(\mathrel {\textsf {mod}}\)”, and where the first argument of all occurrences of \(\textsf {exp}\) must be the same constant. Integrating decision procedures for fragments like this one into our approach is an interesting direction for future work.

6 Implementation and Evaluation

Implementation. We implemented our approach in our novel tool SwInE. It is based on SMT-Switch [31], a library that offers a unified interface for various SMT solvers. SwInE uses the backend solvers Z3 4.12.2 [32] and CVC5 1.0.8 [2]. It supports incrementality and can compute models for variables, but not yet for uninterpreted functions, due to limitations inherited from SMT-Switch.

The backend solver (which defaults to Z3) can be selected via command-line flags. For more information on SwInE and a precompiled release, we refer to [22, 23].

Benchmarks. To evaluate our approach, we synthesized a large collection of \(\text {EIA}\) problems from verification benchmarks for safety, termination, and complexity analysis. More precisely, we ran our verification tool LoAT [18] on the benchmarks for linear Constrained Horn Clauses (CHCs) with linear integer arithmetic from the CHC Competitions 2022 and 2023 [8], as well as on the benchmarks for Termination and Complexity of Integer Transition Systems from the Termination Problems Database (TPDB) [34], the benchmark collection of the Termination and Complexity Competition [25]. We extracted all SMT problems with exponentiation that LoAT created while analyzing these benchmarks, and afterwards removed duplicates.

The resulting benchmark set consists of 4627 SMT problems, which are available at [22]:

  • 669 problems that resulted from the benchmarks of the CHC Competition ’22 (called CHC Comp ’22 Problems below)

  • 158 problems that resulted from the benchmarks of the CHC Competition ’23 (CHC Comp ’23 Problems)

  • 3146 problems that resulted from the complexity benchmarks of the TPDB (Complexity Problems)

  • 654 problems that resulted from the termination benchmarks of the TPDB (Termination Problems)

Evaluation. We ran SwInE with both supported backend solvers (Z3 and CVC5). To evaluate the impact of the different components of our approach, we also tested with configurations where we disabled rewriting, symmetry lemmas, bounding lemmas, interpolation lemmas, or monotonicity lemmas. All experiments were performed on StarExec [33] with a wall clock timeout of 10s and a memory limit of 128GB per example. We chose a small timeout, as LoAT usually has to discharge many SMT problems to solve a single verification task. So in our setting, each individual SMT problem should be solved quickly.

Fig. 1.
figure 1

CHC Comp ’22 – Runtime

Table 1. CHC Comp ’22 – Results
Fig. 2.
figure 2

CHC Comp ’23 – Runtime

Table 2. CHC Comp ’23 – Results
Fig. 3.
figure 3

Complexity – Runtime

Table 3. Complexity – Results
Fig. 4.
figure 4

Termination – Runtime

Table 4. Termination – Results

The results can be seen in Tables 1, 2, 3, and 4, where VB means “virtual best”. All but 48 of the 4627 benchmarks can be solved, and all unsolved benchmarks are Complexity Problems. All CHC Comp Problems can be solved with both backend solvers. Considering Complexity and Termination Problems, Z3 and CVC5 perform almost equally well on unsatisfiable instances, but Z3 solves more satisfiable instances.

Regarding the different components of our approach, our evaluation shows that the impact of rewriting is quite significant. For example, it enables Z3 to solve 81 additional Complexity Problems. Symmetry lemmas enable Z3 to solve more Complexity Problems, but they are less helpful for CVC5. In fact, symmetry lemmas are needed for most of the examples where Z3 succeeds but CVC5 fails, so they seem to be challenging for CVC5, presumably due to the use of “\(\mathrel {\textsf {mod}}\)”. Bounding and interpolation lemmas are crucial for proving satisfiability. In particular, disabling interpolation lemmas harms more than disabling any other feature, which shows their importance. For example, Z3 can only prove satisfiability of 3 CHC Comp Problems without interpolation lemmas.

Interestingly, only CVC5 benefits from monotonicity lemmas, which enable it to solve more Complexity Problems. From our experience, CVC5 explores the search space in a more systematic way than Z3, so that subsequent candidate models often have a similar structure. Then monotonicity lemmas can help CVC5 to find structurally different candidate models.

Remarkably, disabling a single component does not reduce the number of \({\textbf {unsat}} \)’s significantly. Thus, we also evaluated configurations where all components were disabled, so that \(\textsf {exp}\) is just an uninterpreted function. This reduces the number of \({\textbf {sat}} \) results dramatically, but most \({\textbf {unsat}} \) instances can still be solved. Hence, most of them do not require reasoning about exponentials, so it would be interesting to obtain instances where proving \({\textbf {unsat}} \) is more challenging.

The runtime of SwInE can be seen in Figs. 1, 2, 3, and 4. Most instances can be solved in a fraction of a second, as desired for our use case. Moreover, CVC5 can solve more instances in the first half second, but Z3 can solve more instances later on. We refer to [22] for more details on our evaluation.

Validation. We implemented sanity checks for both \({\textbf {sat}} \) and \({\textbf {unsat}} \) results. For \({\textbf {sat}} \), we evaluate the input problem using \(\text {EIA}\) semantics for \(\textsf {exp}\), and the current model for all variables. For \({\textbf {unsat}} \), assume that the input problem \(\varphi \) contains the subterms \(\textsf {exp}(s_0,t_0),\ldots ,\textsf {exp}(s_n,t_n)\). Then we enumerate all SMT problems

$$ \textstyle \varphi \wedge \bigwedge _{i = 0}^n t_i = c_i \wedge \textsf {exp}(s_i,t_i) = s_i^{c_i} \qquad \text {where } c_0,\ldots ,c_n \in [0 .. \ k] \text { for some } k \in \mathbb {N}$$

(we used \(k=10\)). If any of them is satisfiable in \(\text {NIA}\), then \(\varphi \) is satisfiable in \(\text {EIA}\). None of these checks revealed any problems.
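The enumeration scheme can be illustrated on a toy instance. The following self-contained Python sketch replaces the \(\text {NIA}\) backend by brute-force search over a small domain; the formula, the domain, and all names are our own choices for illustration:

```python
# Sketch of the validation check for a toy formula
#   phi: x = exp(2, y)  /\  y > 0  /\  x < 100
# with the single exponential subterm exp(s0, t0) = exp(2, y).
# We enumerate candidate exponents c in [0..k], fix t0 = c and
# exp(s0, t0) = s0^c, and search for a model of the resulting
# exponential-free problem by brute force (standing in for an NIA solver).

K = 10                      # bound on the enumerated exponents (k = 10)
DOMAIN = range(-200, 201)   # brute-force search space for x (our choice)

def phi(x, y, e):
    # e is the value assigned to the subterm exp(2, y)
    return x == e and y > 0 and x < 100

models = []
for c in range(K + 1):
    e = 2**c                # exp(2, y) = 2^c under the constraint y = c
    for x in DOMAIN:
        if phi(x, c, e):
            models.append((x, c))
            break           # one model per exponent value suffices

# phi is satisfiable in EIA iff some strengthened problem is satisfiable
assert models, "no model found up to k = %d" % K
print("models (x, y):", models)
```

Here every strengthened problem with \(1 \le c \le 6\) is satisfiable, confirming that the toy formula is satisfiable in \(\text {EIA}\).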

7 Conclusion

We presented the novel SMT theory \(\text {EIA}\), which extends the theory of non-linear integer arithmetic with integer exponentiation. Moreover, inspired by incremental linearization for similar extensions of non-linear real arithmetic, we developed a CEGAR approach to solve \(\text {EIA}\) problems. The core idea of our approach is to regard exponentiation as an uninterpreted function and to eliminate counterexamples, i.e., models that violate the semantics of exponentiation, by generating suitable lemmas. Here, the use of bilinear interpolation turned out to be crucial, both in practice (see our evaluation in Sect. 6) and in theory, as interpolation lemmas are essential for being able to eliminate any counterexample (see Theorem 23). Finally, we evaluated the implementation of our approach in our novel tool SwInE on thousands of \(\text {EIA}\) problems that were synthesized from verification tasks using our verification tool LoAT. Our evaluation shows that SwInE is highly effective for our use case, i.e., as a backend for LoAT. Hence, we will couple SwInE and LoAT in future work.

With SwInE, we provide an SMT-LIB compliant open-source solver for \(\text {EIA}\) [23]. In this way, we hope to attract users with applications that give rise to challenging benchmarks, as our evaluation suggests that our benchmarks are relatively easy to solve. Moreover, we hope that other solvers with support for integer exponentiation will follow, with the ultimate goal of standardizing \(\text {EIA}\).