Unification with Abstraction and Theory Instantiation in Saturation-Based Reasoning
Abstract
We present a new method of using SMT solvers in saturation-based reasoning, based on two new inference rules for reasoning with non-ground clauses. The first rule utilises theory constraint solving (an SMT solver) to perform reasoning within a clause to find an instance where we can remove one or more theory literals. This utilises the power of SMT solvers for theory reasoning with non-ground clauses, reasoning which is currently achieved by the addition of often prolific theory axioms. The second rule is unification with abstraction, where the notion of unification is extended to introduce constraints where theory terms may not otherwise unify. This abstraction is performed lazily, as needed, to allow the superposition theorem prover to make as much progress as possible without the search space growing too quickly. Additionally, the first rule can be used to discharge the constraints introduced by the second. These rules were implemented within the Vampire theorem prover and experimental results show that they are useful for solving a considerable number of previously unsolved problems. The current implementation focuses on complete theories, in particular various versions of arithmetic.
1 Introduction
Reasoning in quantifier-free first-order logic with theories, such as arithmetic, is hard. Reasoning with quantifiers and first-order theories is very hard. It is undecidable in general and \(\varPi _1^1\)-complete for many simple combinations, for example linear (real or integer) arithmetic and uninterpreted functions [16]. At the same time such reasoning is essential to the future success of certain application areas, such as program analysis and software verification, that rely on quantifiers to, for example, express properties of objects, inductively defined data structures, the heap and dynamic memory allocation. This paper presents a new approach to theory reasoning with quantifiers that (1) uses an SMT solver to do local theory reasoning within a clause, and (2) extends unification to avoid the need to explicitly separate theory and non-theory parts of clauses.
There are two directions of research in the area of reasoning with problems containing quantifiers and theories. The first is the extension of SMT solvers with instantiation heuristics such as E-matching [9, 12]. The second is the extension of first-order reasoning approaches with support for theory reasoning (note that the instantiation heuristics from SMT solvers are not appropriate in this context, as discussed in [26]). There have been a number of varied attempts in this second direction, with some approaches extending various calculi [2, 3, 7, 8, 13, 16, 28] or using an SMT solver to deal with the ground part of the problem [20]. This second approach includes our previous work developing AVATAR modulo theories [21], which complements the approach presented in this paper as explained later. A surprisingly effective approach to theory reasoning with first-order theorem provers is to add theory axioms (i.e. axioms from the theory of interest). Whilst this has no hope of being complete, it can be used to prove a large number of problems of interest. However, theory axioms can be highly prolific in saturation-based proof search and often swamp the search space with irrelevant consequences of the theory [22]. This combinatorial explosion prevents theory axioms from being useful in cases where deep theory reasoning is required. This paper provides a solution that allows for a combination of these approaches, i.e. the integration with an SMT solver, the use of theory axioms, and the heuristic extension of the underlying calculi.
As explained in Sect. 3, these observations lead to an instantiation rule that considers clauses to be in the form \(T \rightarrow C\), where T is the theory part, and uses an SMT solver to find a substitution \(\theta \) under which T is valid in the given theory, thus producing the instance \(C\theta \). In the case where \(C=\bot \), this rule can find general inconsistencies.
The second rule is related to the use of abstraction. By an abstraction we mean (variants of) the rule obtaining from a clause C[t], where t is a non-variable term, a clause \(x \not \simeq t \vee C[x]\), where x is a new variable. Abstraction is implemented in several theorem provers, including the previous version of our theorem prover Vampire [18] used for experiments described in this paper.
1. A fully abstracted clause tends to be much longer, especially if the original clause contains deeply nested theory and non-theory symbols. Getting rid of long clauses was one of the motivations of our previous AVATAR work on clause splitting [34] (see this work for why long clauses are problematic for resolution-based approaches). However, the long clauses produced by abstraction will share variables, reducing the impact of AVATAR.
2. The AVATAR modulo theories approach [21] ensures that the first-order solver is only exploring part of the search space that is theory-consistent in its ground part (using an SMT solver to achieve this). This is effective but relies on ground literals remaining ground, even those that mix theory and non-theory symbols. Full abstraction destroys such ground literals.
3. As mentioned previously, the addition of theory axioms can be effective for problems requiring shallow theory reasoning. Working with fully abstracted clauses forces first-order reasoning to treat the theory part of a clause differently. This makes it difficult to take full advantage of theory axiom reasoning.
The final reason we chose not to fully abstract clauses in our work is that the main advantage of full abstraction for us would be that it deals with the above problem, but we have a solution which we believe solves this issue in a more satisfactory way, as confirmed by our experiments described in Sect. 5.
The second idea is to perform this abstraction lazily, i.e., only where it is required to perform inference steps. As described in Sect. 4, this involves extending unification to produce theory constraints under which two terms will unify. As we will see, these theory constraints are exactly the kind of terms that can be handled easily by the instantiation technique introduced in our first idea.
1. a new instantiation rule that uses an SMT solver to provide instances consistent with the underlying theory (Sect. 3),
2. an extension of unification that provides a mechanism to perform lazy abstraction, i.e., only abstracting as much as is needed, which results in clauses with theory constraints that can be discharged by the previous instantiation technique (Sect. 4),
3.
4. an experimental evaluation that demonstrates the effectiveness of these techniques both individually and in combination with the rest of the powerful techniques implemented within Vampire (Sect. 5).
An extended version of this paper [32] contains further examples and discussion. We start our presentation by introducing the necessary background material.
2 Preliminaries and Related Work
First-Order Logic and Theories. We consider a many-sorted first-order logic with equality. A signature is a pair \(\varSigma = (\varXi ,\varOmega )\) where \(\varXi \) is a set of sorts and \(\varOmega \) a set of predicate and function symbols with associated argument and return sorts from \(\varXi \). Terms are of the form c, x, or \(f(t_1,\ldots ,t_n)\) where f is a function symbol of arity \(n \ge 1\), \(t_1,\ldots , t_n\) are terms, c is a zero-arity function symbol (i.e. a constant) and x is a variable. We assume that all terms are well-sorted. Atoms are of the form \(p(t_1,\ldots ,t_n), q\) or \(t_1 \simeq _s t_2\) where p is a predicate symbol of arity n, \(t_1, \ldots , t_n\) are terms, q is a zero-arity predicate symbol and for each sort \(s\in \varXi \), \(\simeq _s\) is the equality symbol for the sort s. We write simply \(\simeq \) when s is known from the context or irrelevant. A literal is either an atom A, in which case we call it positive, or a negation of an atom \(\lnot A\), in which case we call it negative. When L is a negative literal \(\lnot A\) and we write \(\lnot L\), we mean the positive literal A. We write negated equalities as \(t_1 \not \simeq t_2\). We write \(t[s]_p\) and \(L[s]_p\) to denote that a term s occurs in a term t (in a literal L) at a position p.
A clause is a disjunction of literals \(L_1 \vee \ldots \vee L_n\) for \(n \ge 0\). We disregard the order of literals and treat a clause as a multiset. When \(n=0\) we speak of the empty clause, which is always false. When \(n=1\) a clause is called a unit clause. Variables in clauses are considered to be universally quantified. Standard methods exist to transform an arbitrary first-order formula into clausal form (e.g. [19] and our recent work in [25]).
A substitution is any expression \(\theta \) of the form \(\{x_1 \mapsto t_1,\ldots ,x_n \mapsto t_n\}\), where \(n \ge 0\). \(E\theta \) is the expression obtained from E by the simultaneous replacement of each \(x_i\) by \(t_i\). By an expression we mean a term, an atom, a literal, or a clause. An expression is ground if it contains no variables. An instance of E is any expression \(E\theta \) and a ground instance of E is any instance of E that is ground. A unifier of two terms, atoms or literals \(E_1\) and \(E_2\) is a substitution \(\theta \) such that \(E_1\theta =E_2\theta \). It is known that if two expressions have a unifier, then they have a so-called most general unifier.
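The notions of substitution and most general unifier can be made concrete with a small sketch of standard syntactic unification. The term encoding (variables as strings, applications as tuples) and all function names are assumptions made for this illustration; they are not taken from the paper or from Vampire. The returned substitution is kept in triangular form, so bindings are resolved by `apply`.

```python
# Illustrative encoding (an assumption of this sketch): a variable is a str,
# a function application f(t1, ..., tn) is the tuple ("f", t1, ..., tn),
# and a constant c is the 1-tuple ("c",).

def is_var(t):
    return isinstance(t, str)

def walk(t, subst):
    """Follow variable bindings until an unbound variable or a non-variable term."""
    while is_var(t) and t in subst:
        t = subst[t]
    return t

def occurs(x, t, subst):
    """Occurs check: does variable x occur in t under the current bindings?"""
    t = walk(t, subst)
    if is_var(t):
        return t == x
    return any(occurs(x, arg, subst) for arg in t[1:])

def unify(s, t):
    """Return a most general unifier of s and t as a dict, or None if none exists."""
    subst = {}
    todo = [(s, t)]
    while todo:
        a, b = todo.pop()
        a, b = walk(a, subst), walk(b, subst)
        if a == b:
            continue
        if is_var(a):
            if occurs(a, b, subst):
                return None
            subst[a] = b
        elif is_var(b):
            if occurs(b, a, subst):
                return None
            subst[b] = a
        elif a[0] == b[0] and len(a) == len(b):
            todo.extend(zip(a[1:], b[1:]))  # same symbol: unify arguments pairwise
        else:
            return None  # symbol clash: no unifier
    return subst

def apply(t, subst):
    """Compute the instance of t under subst, resolving bindings recursively."""
    t = walk(t, subst)
    if is_var(t):
        return t
    return (t[0],) + tuple(apply(arg, subst) for arg in t[1:])

# f(x, g(y)) and f(a, g(x)) unify; both sides become f(a, g(a)).
lhs = ("f", "x", ("g", "y"))
rhs = ("f", ("a",), ("g", "x"))
theta = unify(lhs, rhs)
```

Here `apply(lhs, theta)` and `apply(rhs, theta)` coincide, while `unify("x", ("f", "x"))` fails on the occurs check.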
We assume a standard notion of a (first-order, many-sorted) interpretation \(\mathcal {I}\), which assigns a non-empty domain \(\mathcal {I}_s\) to every sort \(s\in \varXi \), and maps every function symbol f to a function \(\mathcal {I}_f\) and every predicate symbol p to a relation \(\mathcal {I}_p\) on these domains so that the mapping respects sorts. We call \(\mathcal {I}_f\) the interpretation of f in \(\mathcal {I}\), and similarly for \(\mathcal {I}_p\) and \(\mathcal {I}_s\). Interpretations are also sometimes called first-order structures. A sentence is a closed formula, i.e., with no free variables. We use the standard notions of validity and satisfiability of sentences in such interpretations. An interpretation is a model for a set of clauses if (the universal closure of) each of these clauses is true in the interpretation.
A theory \(\mathcal {T}\) is identified by a class of interpretations. A sentence is satisfiable in \(\mathcal {T}\) if it is true in at least one of these interpretations and valid if it is true in all of them. A function (or predicate) symbol f is called uninterpreted in \(\mathcal {T}\), if for every interpretation \(\mathcal {I}\) of \(\mathcal {T}\) and every interpretation \(\mathcal {I}'\) which agrees with \(\mathcal {I}\) on all symbols apart from f, \(\mathcal {I}'\) is also an interpretation of \(\mathcal {T}\). A theory is called complete if, for every sentence F of this theory, either F or \(\lnot F\) is valid in this theory. Evidently, every theory of a single interpretation is complete. We can define satisfiability and validity of arbitrary formulas in an interpretation in a standard way by treating free variables as new uninterpreted constants.
For example, the theory of integer arithmetic fixes the interpretation of a distinguished sort \(s_{int} \in \varXi _{IA}\) to the set of mathematical integers \(\mathbb {Z}\) and analogously assigns the usual meanings to \(\{ +, -, <, >, * \} \subseteq \varOmega _{IA}\). We will mostly deal with theories whose restriction to interpreted symbols is a complete theory, for example, integer or real linear arithmetic. In the sequel we assume that \(\mathcal {T}\) is an arbitrary but fixed theory and give definitions relative to this theory.
Abstracted Clauses. Here we discuss how a clause can be separated into a theory and non-theory part. To this end we need to divide symbols into theory and non-theory symbols. When we deal with a combination of theories we consider as theory symbols those symbols interpreted in at least one of the theories and all other symbols as non-theory symbols. That is, non-theory symbols are uninterpreted in all theories.
A non-equality literal is a theory literal if its predicate symbol is a theory symbol. An equality literal \(t_1 \simeq _s t_2\) is a theory literal, if the sort s is a theory sort. A non-theory literal is any literal that is not a theory literal. A literal is pure if it contains only theory symbols or only non-theory symbols. A clause is fully abstracted, or simply abstracted, if it only contains pure literals. A clause is partially abstracted if non-theory symbols do not appear in theory literals. Note that in partially abstracted clauses theory symbols are allowed to appear in non-theory literals.
A non-variable term t is called a theory term (respectively non-theory term) if its top function symbol is a theory (respectively non-theory) symbol. When we say that a term is a theory or a non-theory term, we assume that this term is not a variable.
Given a non-abstracted clause \(L[t] \vee C\) where L is a theory literal and t a non-theory term (or the other way around), we can construct the equivalent clause \(L[x] \vee C \vee x\not \simeq t\) for a fresh variable x. Repeated application of this process will lead to an abstracted clause, and doing this only for theory literals will result in a partially abstracted clause. In both cases, the results are unique (up to variable renaming).
The above abstraction process will take \(a+a \simeq 1\), where a is a non-theory symbol, and produce \(x+y \simeq 1 \vee x \not \simeq a \vee y \not \simeq a\). There is a simpler equivalent fully abstracted clause \(x+x \simeq 1 \vee x \not \simeq a\), and we would like to avoid unnecessarily long clauses. For this reason, we will assume that abstraction will abstract syntactically equal subterms using the same fresh variable, as in the above example. If we abstract larger terms first, the result of abstractions will be unique up to variable renaming.
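The sharing of fresh variables between syntactically equal subterms can be sketched as follows. This is a minimal illustration of one abstraction pass over a single theory atom, not the procedure implemented in Vampire. The term encoding, the set `THEORY_SYMBOLS`, the fresh variable names `X0, X1, ...` (assumed not to occur in the input) and the function name are all assumptions of this sketch.

```python
# Illustrative encoding (assumption): variables are str, applications are
# tuples ("f", t1, ..., tn), constants and numerals are 1-tuples such as ("a",).
THEORY_SYMBOLS = {"+", "*", "0", "1"}  # hypothetical set of interpreted symbols

def abstract_theory_atom(atom):
    """Replace every maximal non-theory subterm of a theory atom by a fresh
    variable, reusing the same variable for syntactically equal subterms.
    Returns the abstracted atom and the introduced disequalities x != t."""
    names = {}  # non-theory subterm -> its fresh variable

    def go(t):
        if isinstance(t, str):          # a variable stays as it is
            return t
        if t[0] in THEORY_SYMBOLS:      # theory term: descend into arguments
            return (t[0],) + tuple(go(arg) for arg in t[1:])
        if t not in names:              # non-theory term: abstract it as a whole
            names[t] = f"X{len(names)}"
        return names[t]

    new_atom = (atom[0],) + tuple(go(arg) for arg in atom[1:])
    constraints = [("!=", x, t) for t, x in names.items()]
    return new_atom, constraints

# a + a = 1 becomes X0 + X0 = 1 together with the single literal X0 != a.
atom = ("=", ("+", ("a",), ("a",)), ("1",))
abstracted, constraints = abstract_theory_atom(atom)
```

Because the traversal is top-down, a non-theory subterm is replaced as a whole before its arguments are inspected, which corresponds to abstracting larger terms first.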
Saturation-Based Proof Search (and Theory Reasoning). We introduce our new approach within the context of saturation-based proof search. The general idea in saturation is to maintain two sets of Active and Passive clauses. A saturation loop then selects a clause C from Passive, places C in Active, applies generating inferences between C and clauses in Active, and finally places newly derived clauses in Passive after applying some retention tests. The retention tests involve checking whether the new clause is itself redundant (i.e. a tautology) or redundant with respect to existing clauses.
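The loop described above can be sketched for the simplest possible setting. To keep the example short, it uses propositional clauses and binary resolution as the only generating inference, and selects the smallest passive clause first. These choices and all names are assumptions of this sketch; Vampire's actual loop works on first-order clauses with many more inferences and retention tests.

```python
# Clauses are frozensets of non-zero ints; -n is the negation of literal n.

def resolvents(c, d):
    """All binary resolvents of clauses c and d."""
    for lit in c:
        if -lit in d:
            yield frozenset((c - {lit}) | (d - {-lit}))

def is_tautology(c):
    return any(-lit in c for lit in c)

def saturate(clauses, max_steps=10_000):
    passive = [frozenset(c) for c in clauses]
    active = set()
    for _ in range(max_steps):
        if not passive:
            return "saturated"           # no inference left: the set is satisfiable
        given = min(passive, key=len)    # clause selection: smallest clause first
        passive.remove(given)
        if not given:
            return "unsat"               # the empty clause was derived
        if given in active:
            continue
        active.add(given)
        for other in list(active):       # generating inferences with Active
            for new in resolvents(given, other):
                # Retention tests: drop tautologies and clauses already present.
                if is_tautology(new) or new in active or new in passive:
                    continue
                passive.append(new)
    return "unknown"

result = saturate([[1], [-1, 2], [-2]])
```

On this input the loop derives the empty clause, whereas `saturate([[1, 2], [-1, 2]])` runs out of passive clauses and reports saturation.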
To perform theory reasoning within this context it is common to do two things. Firstly, to evaluate new clauses to put them in a common form (e.g. rewrite all inequalities in terms of <) and evaluate ground theory terms and literals (e.g. \(1+2\) becomes 3 and \(1<2\) becomes \( true \)). Secondly, as previously mentioned, relevant theory axioms can be added to the initial search space. For example, if the input clauses use the \(+\) symbol one can add the axioms \(x+y \simeq y+x\) and \(x+0 \simeq x\), among others.
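The evaluation of ground theory terms and literals can be illustrated with a short bottom-up evaluator. The encoding (numerals as Python integers, variables as strings, applications as tuples), the tables `OPS` and `PREDS`, and the function name are assumptions of this sketch, which covers only binary integer operations.

```python
import operator

# Hypothetical tables of interpreted binary symbols.
OPS = {"+": operator.add, "-": operator.sub, "*": operator.mul}
PREDS = {"<": operator.lt, "=": operator.eq}

def evaluate(t):
    """Evaluate ground arithmetic subterms bottom-up; leave everything else as is."""
    if not isinstance(t, tuple):
        return t                                   # numeral or variable
    head = t[0]
    args = [evaluate(arg) for arg in t[1:]]
    if len(args) == 2 and all(isinstance(a, int) for a in args):
        if head in OPS:
            return OPS[head](*args)                # e.g. 1 + 2 becomes 3
        if head in PREDS:
            return PREDS[head](*args)              # e.g. 1 < 2 becomes True
    return (head, *args)

simplified = evaluate(("p", ("+", 1, 3)))          # p(1 + 3) becomes p(4)
```

A non-ground term such as `("p", ("+", "x", 3))` is returned unchanged, since `x + 3` has no value until `x` is instantiated.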
3 Generating Simpler Instances
In the introduction, we showed how useful instances can be generated by finding substitutions that make theory literals false. We provide further motivation for the need for instances and then describe a new inference rule capturing this approach.
Theory Instantiation. From the above discussion it is clear that generating instances of theory literals may drastically improve performance of saturation-based theorem provers. The problem is that the set of all such instances can be infinite, so we should try to generate only those instances that are likely not to degrade the performance.
1. P contains only pure theory literals;
2. \(\lnot {P}\theta \) is valid in \(\mathcal {T}\) (equivalently, \(P\theta \) is unsatisfiable in \(\mathcal {T}\));
3. P contains no literals trivial in \(P \vee D\).
The second condition ensures that \(P\theta \) can be safely removed. This also avoids making a theory literal valid in the theory (a theory tautology) after instantiation. For example, if we had instantiated clause (1) with \(\{ x \mapsto 3\}\) then the clause would have been evaluated to \( true \) (because of \(3<4\)) and thrown away as a theory tautology.
1. L is of the form \(x \not \simeq t\) such that x does not occur in t;
2. L is a pure theory literal;
3. every occurrence of x in C apart from its occurrence in \(x \not \simeq t\) is either in a literal that is not a pure theory literal, or in a literal trivial in C.
It is easy to argue that all pure theory literals introduced by abstraction are trivial.
1. abstract relevant literals;
2. collect (all) non-trivial pure theory literals \(L_1,\ldots , L_n\);
3. run an SMT solver on \(T = \lnot L_1 \wedge \ldots \wedge \lnot L_n\);
4. if the SMT solver returns
   - a model, we turn it into a substitution \(\theta \) such that \(T\theta \) is valid in \(\mathcal {T}\);
   - unsatisfiable, then C is a theory tautology and can be removed.
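Steps 3 and 4 above can be sketched as follows. The sketch does not call a real SMT solver: `toy_solve` is a brute-force stand-in over a small integer range, introduced only so that the example is self-contained. All names, the encoding of theory literals as Python predicates over a model, and the example clause are assumptions of this illustration.

```python
from itertools import product

def toy_solve(variables, constraints, bound=20):
    """Stand-in for an SMT solver (assumption): brute-force search for an
    integer model of all constraints with every variable in [-bound, bound]."""
    for values in product(range(-bound, bound + 1), repeat=len(variables)):
        model = dict(zip(variables, values))
        if all(holds(model) for holds in constraints):
            return model
    # Nothing found in the box. Unlike a real solver's "unsat" answer,
    # this is NOT a proof that the constraints are unsatisfiable.
    return None

def theory_inst(theory_literals, variables, rest):
    """For a clause P v D, with the literals of P given as predicates over a
    model and D given as a function building its instance, return the instance
    of D, or None if no model was found."""
    # Step 3: look for a model of T = not L1 and ... and not Ln.
    negated = [lambda m, lit=lit: not lit(m) for lit in theory_literals]
    model = toy_solve(variables, negated)
    if model is None:
        return None
    # Step 4: read the model as a substitution. Every selected theory literal
    # is false under it, so only the instance of D remains.
    return rest(model)

# Hypothetical clause: 2x != 10  v  q(x). The only model of 2x = 10 is x = 5.
instance = theory_inst([lambda m: 2 * m["x"] != 10], ["x"], lambda m: ("q", m["x"]))
```

With a real solver, an unsatisfiable answer in step 3 would additionally justify deleting the clause as a theory tautology; the bounded search here cannot support that conclusion.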

Note that the abstraction step is not necessary for using \( TheoryInst \), since it will only introduce trivial literals. However, for each introduced theory literal \(x \not \simeq t\) the variable x occurs in a non-theory literal and inferences applied to this non-theory literal may instantiate x to a term s such that \(s \not \simeq t\) is non-trivial. Let us now discuss the implementation of each step in further detail.

strong: Only select strong literals where a literal is strong if it is a negative equality or an interpreted literal.

overlap: Select all strong literals and additionally those theory literals whose variables overlap with a strong literal.

all: Select all non-trivial pure theory literals.
At this point there may not be any pure theory literals to select, in which case the inference will not be applied.
Interacting with the SMT solver. In this step, we replace variables in selected pure theory literals by new constants and negate the literals. Once this has been done, the translation of literals to the format understood by the SMT solver is straightforward (and outlined in [21]). We use Z3 [11] in this work.
Additional care needs to be taken when translating partial functions, such as division. In SMT solving, they are treated as total underspecified functions. For example, when \(\mathcal {T}\) is integer arithmetic with division, interpretations for \(\mathcal {T}\) are defined in such a way that for all integers a, b and interpretation \(\mathcal {I}\), the theory also has the interpretation defined exactly as \(\mathcal {I}\) apart from having \(a/0 = b\). In a way, division by 0 behaves as an uninterpreted function.
To deal with this issue, we assert that \(s \not \simeq 0\) whenever we translate a term of the form t / s. This implies that we do not pass to the SMT solver terms of the form t / 0.
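The side conditions for division can be collected during translation, as in the following sketch. The term encoding (numerals as Python integers, variables as strings, applications as tuples with `"/"` for division) and the function name are assumptions made for illustration.

```python
def division_guards(term):
    """Collect one assertion s != 0 for every subterm of the form t / s."""
    guards = []

    def visit(t):
        if isinstance(t, tuple):
            if t[0] == "/":
                guards.append(("!=", t[2], 0))   # the divisor must be non-zero
            for arg in t[1:]:
                visit(arg)

    visit(term)
    return guards

# 1 / x + y / (z / w) needs three guards: x != 0, z / w != 0 and w != 0.
guards = division_guards(("+", ("/", 1, "x"), ("/", "y", ("/", "z", "w"))))
```

A non-empty result also records that extra assumptions were passed to the solver, which matters when an unsatisfiable answer is later used to justify deleting a clause.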
1. interpreted symbols that have a fixed interpretation in \(\mathcal {T}\), such as 0 or \(+\);
2. other interpreted symbols, such as division;
3. variables of T.
1. S is satisfiable in \(\mathcal {T}\);
2. \(S \rightarrow T\) is valid in \(\mathcal {T}\).
Note that checking that T is satisfiable and returning T as a model satisfies both conditions, but does not give a substitution that can be used to apply the \( TheoryInst \) rule.
Practically, we must evaluate the introduced constants (i.e. those introduced for each of the variables in the above step) in the given model. In some cases, this evaluation fails to give a numeric value. This happens, for example, if the result falls outside the range of values internally representable by Vampire, or when the value is a proper algebraic number, which currently cannot be represented internally by our prover either. In this case, we cannot produce a substitution and the inference fails.
Theory Tautology Deletion. As we pointed out above, if the SMT solver returns unsatisfiable then C is a theory tautology and can be removed. We only do it when we do not pass to the solver additional assumptions related to division by 0.
4 Abstraction Through Unification
1. \(\theta \) is a substitution and D is a (possibly empty) disjunction of disequalities;
2. \((D \vee t \simeq s)\theta \) is valid in the underlying theory (and even valid in predicate logic).

interpreted_only: only produce a constraint if the top-level symbol of both terms is a theory symbol,

one_side_interpreted: only produce a constraint if the top-level symbol of at least one term is a theory symbol,

one_side_constant: only produce a constraint if the top-level symbol of at least one term is a theory symbol and the other is an uninterpreted constant,

all: allow all terms of theory sort to unify and produce constraints.
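The effect of these option values can be illustrated with a sketch of unification that records a disequality constraint where two terms with different top-level symbols would otherwise fail to unify. This is not the algorithm used in Vampire, which is not reproduced in this text; it is a minimal reading of the four option descriptions above. The term encoding, the symbol sets `THEORY_SYMBOLS` and `THEORY_SORTED`, and all function names are assumptions, and numerals are encoded as Python integers.

```python
THEORY_SYMBOLS = {"+", "*"}   # hypothetical interpreted function symbols
THEORY_SORTED = {"a", "f"}    # hypothetical uninterpreted symbols of theory sort

def is_var(t):
    return isinstance(t, str)

def is_theory_term(t):
    return isinstance(t, int) or (isinstance(t, tuple) and t[0] in THEORY_SYMBOLS)

def is_uninterpreted_constant(t):
    return isinstance(t, tuple) and len(t) == 1 and t[0] not in THEORY_SYMBOLS

def has_theory_sort(t):
    return is_theory_term(t) or (isinstance(t, tuple) and t[0] in THEORY_SORTED)

def can_abstract(policy, s, t):
    """Decide whether the clash between s and t may become a constraint.
    Only called for terms that are syntactically different."""
    if isinstance(s, int) and isinstance(t, int):
        return False  # two different numerals can never be equal
    if policy == "interpreted_only":
        return is_theory_term(s) and is_theory_term(t)
    if policy == "one_side_interpreted":
        return is_theory_term(s) or is_theory_term(t)
    if policy == "one_side_constant":
        return ((is_theory_term(s) and is_uninterpreted_constant(t)) or
                (is_theory_term(t) and is_uninterpreted_constant(s)))
    if policy == "all":
        return has_theory_sort(s) and has_theory_sort(t)
    return False      # "off": plain syntactic unification

def walk(t, subst):
    while is_var(t) and t in subst:
        t = subst[t]
    return t

def occurs(x, t, subst):
    t = walk(t, subst)
    if is_var(t):
        return t == x
    return isinstance(t, tuple) and any(occurs(x, a, subst) for a in t[1:])

def apply(t, subst):
    t = walk(t, subst)
    if isinstance(t, tuple):
        return (t[0],) + tuple(apply(a, subst) for a in t[1:])
    return t

def unify_with_abstraction(s, t, policy):
    """Return (substitution, constraints), or None if unification fails."""
    subst, constraints, todo = {}, [], [(s, t)]
    while todo:
        a, b = todo.pop()
        a, b = walk(a, subst), walk(b, subst)
        if a == b:
            continue
        if is_var(a) or is_var(b):
            x, u = (a, b) if is_var(a) else (b, a)
            if occurs(x, u, subst):
                return None
            subst[x] = u
        elif (isinstance(a, tuple) and isinstance(b, tuple)
              and a[0] == b[0] and len(a) == len(b)):
            todo.extend(zip(a[1:], b[1:]))       # same symbol: decompose
        elif can_abstract(policy, a, b):
            constraints.append(("!=", a, b))     # clash: postpone as a constraint
        else:
            return None
    return subst, [("!=", apply(l, subst), apply(r, subst)) for _, l, r in constraints]

# p(2 * x) against p(10): no syntactic unifier, but one constraint 2 * x != 10.
result = unify_with_abstraction(("p", ("*", 2, "x")), ("p", 10), "interpreted_only")
```

With the policy set to `"off"` the same call fails, and a pair such as `p(a)` and `p(2 + x)` yields a constraint under `"one_side_constant"` but not under `"interpreted_only"`.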
Now given the problem from the introduction involving p(2x) and \(\lnot p(10)\) we can apply Resolution-wA to produce \(2x \not \simeq 10\), which can be resolved using evaluation and equality resolution as before. We note at this point that a further advantage of this updated calculus is that it directly resolves the issue of losing proofs via eager evaluation, e.g. where \(p(1+3)\) is evaluated to p(4), missing the chance to resolve with \(\lnot p(x+3)\).
Implementation. In Vampire, as in most modern theorem provers, inferences involving unification are implemented via term indexing [30]. Therefore, to update how unification is applied we need to update our implementation of term indexing. As the field of term indexing is highly complex we only give a sketch of the update here.
Term indices provide the ability to use a query term t to extract terms that unify (or match, or generalise) with t along with the relevant substitutions. Like many theorem provers, Vampire uses substitution trees [14] to index terms. The idea behind substitution trees is to abstract a term into a series of substitutions required to generate that term and store these substitutions in the nodes of the tree. To search for unifying terms we perform a backtracking search over the tree, composing substitutions from the nodes when descending down edges and checking at each node whether the query term is consistent with the current substitution. This involves unifying subterms of the query term against terms at nodes and a backtrackable result substitution must be maintained to store the results of these unifications. The result substitution must be backtracked as appropriate i.e. when backtracking past the point of unification.
To update this process we do two things. Firstly, wherever previously a unification failed we will produce a set of constraints using Algorithm 1. Secondly, alongside the backtrackable result substitution we maintain a backtrackable stack of constraints so that whenever we backtrack past a point where we made a unification that produced some constraints we remove those constraints from the stack.
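The backtrackable stack of constraints can be sketched with a list and a stack of saved sizes. The class and method names are assumptions made for illustration and do not correspond to Vampire's data structures.

```python
class BacktrackableStack:
    """A stack that can be rolled back to the most recent saved point."""

    def __init__(self):
        self._items = []
        self._marks = []   # sizes of _items at the saved points

    def mark(self):
        """Save the current state, e.g. just before unifying at a tree node."""
        self._marks.append(len(self._items))

    def push(self, item):
        self._items.append(item)

    def backtrack(self):
        """Discard everything pushed since the most recent mark."""
        size = self._marks.pop()
        del self._items[size:]

    def items(self):
        return list(self._items)

constraints = BacktrackableStack()
constraints.push("c1")     # constraint from a node that stays on the search path
constraints.mark()         # descend into a child node
constraints.push("c2")
constraints.push("c3")
constraints.backtrack()    # leave the child: c2 and c3 are removed together
remaining = constraints.items()
```

Marking the constraint stack at the same points as the result substitution keeps the two structures in step during the search.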
5 Experimental Results
We present experimental results evaluating the effectiveness of the new techniques. Our experiments were carried out on a cluster on which each node is equipped with two quad core Intel processors running at 2.4 GHz and 24 GiB of memory.
Comparing New Options. We were interested in comparing how various proof option values affect the performance of a theorem prover. We consider the two new options referred to here by their short names: uwa (unification with abstraction) and thi (theory instantiation). In addition, we consider the boolean option fta (full theory abstraction), applying full abstraction to input clauses as implemented in previous versions of Vampire.
Making such a comparison is hard, since there is no obvious methodology for doing so, especially considering that Vampire has over 60 options commonly used in experiments (see [24]). The majority of these options are Boolean, some are finitely-valued, some integer-valued and some range over other infinite domains. The method we used here was based on the following ideas, already described in [17].
1. We use a subset of problems with quantifiers and theories from the SMT-LIB library [5] (version 2016-05-23) that (i) do not contain bit vectors, (ii) are not trivially solvable, and (iii) are solvable by some approach.
2. We repeatedly select a random problem P in this set, a random strategy S and run P on variants of S obtained by choosing possible values for the three options using the same time limit.
Evaluation of the 24 meaningful combinations of the three tested options
It may seem surprising that the overall best strategy has all the three options turned off. This is due to what we have observed previously: many SMT-LIB problems with quantifiers and theories require very little theory reasoning. Indeed, Vampire solves a large number of problems (including problems unsolvable by existing SMT solvers) just by adding theory axioms and then running superposition with no theory-related rules. Such problems do not gain from the new options, because new inference rules result only in more generated clauses. Due to the portfolio approach of modern theorem provers, our focus is on cases where new options are complementary to existing ones.
Let us summarise the behaviour of the three options, obtained by a more detailed analysis of our experimental results.
Full Theory Abstraction. Probably the most interesting observation from these results is that the use of full abstraction (fta) results in an observable degradation of performance. This confirms our intuition that unification with abstraction is a good replacement for abstraction. As a result, we will remove the fta option from Vampire.
Unification with Abstraction. This option turned out to be very useful. Many problems had immediate solutions with uwa turned on and no solutions when it was turned off. Further, the value all resulted in 12 unique solutions. We have decided to keep the values all, interpreted_only and off.
Results from finding solutions to previously unsolved problems.
1. by reducing the overall schedule time when problems are solved faster or when a single strategy replaces one or more old strategies;
2. by solving previously unsolved problems.
While for decidable classes, such as propositional logic, the first way can be more important, in first-order logic it is usually the second way that matters. The reason is that, if a problem is solvable by a prover, it is usually solvable with a short running time.
We ran Vampire trying to solve, using the new options, problems previously unsolved by Vampire. We took all such problems from the TPTP library [33] and SMT-LIB [5]; Table 2 shows the results. In the table, new solutions are counted with respect to what Vampire could previously solve, and uniquely solved stands for the number of new problems with respect to what can be solved by other entrants into SMT-COMP and CASC, where the main competitors are SMT solvers such as Z3 [11] and CVC4 [4] and ATPs such as Beagle [6] and Princess [28, 29].
With the help of the new options Vampire solved 20 problems previously unsolved by any other theorem prover or SMT solver.
6 Related Work
We review relevant related work. A more thorough review can be found in [32].
SMT Solving. SMT solvers such as Z3 [11] and CVC4 [4] implement E-matching [9, 12], model-based quantifier instantiation [9, 12] and conflict instantiation [27] to handle quantifiers. Although complete on some fragments, these instantiation techniques are generally heuristic and cannot be directly applied in our setting (see [26]).
In \(DPLL(\varGamma )\) [10] a superposition prover is combined with an SMT solver such that ground literals implied by the SMT solver are used as hypotheses to first-order clauses.
AVATAR Modulo Theories. Our previous work on AVATAR Modulo Theories [21] uses the AVATAR architecture [23, 34] for clause splitting to integrate an SMT solver with a superposition prover. The general idea is to abstract the clause search space as an SMT problem and use an SMT solver to decide on at least one literal per clause to have in the current search space of the superposition prover. To abstract the clause search space, non-ground components (subclauses sharing variables) are abstracted as propositional symbols whilst ground literals are translated directly. The result is that the superposition prover only deals with a set of clauses that is theory-consistent in its ground part.
Theory Resolution. Stickel’s Theory Resolution [31] is a generalisation of the resolution inference rule whose aim is to exclude the often prolific theory axioms from explicit participation in reasoning about the uninterpreted part of a given problem. In [32] we show that the theory resolution rule is a redefinition of \(\mathcal {T}\)-sound inferences. Given this, it is too abstract per se to bear practical relevance to our approach.
Hierarchic Superposition. Hierarchic Superposition (HS) [3] is a generalisation of the superposition calculus for black-box style theory reasoning. The approach uses full abstraction to separate theory and non-theory parts of the problem and introduces a conceptual hierarchy between uninterpreted reasoning (with the calculus) and theory reasoning (delegated to a theory solver) by making pure theory terms smaller than everything else. HS guarantees refutational completeness under certain conditions that can be rather restrictive, e.g., the clauses p(x) and \(\lnot p(f(c))\) cannot be resolved if the return sort of function f is a theory sort. The strategy of weak abstractions introduced by Baumgartner and Waldmann [7] partially addresses the downsides of the original approach. However, their approach requires some decisions to be made, for which there currently does not seem to be a practical solution. See [32] for more details.
Other Theory Instantiation. SPASS+T [20] implements a theory instantiation rule that is analogous to E-matching in the sense that it uses ground theory terms from the search space to perform instantiations as a last resort. This is not related to our approach.
Unification Modulo Theories. There is a large amount of work on unification modulo various theories, such as AC. This work is not related since we are not looking for the set of all or most general solutions to unification. Instead, we postpone finding such solutions by creating constraints, which can then be processed by the SMT solver.
7 Conclusion
We have introduced two new techniques for reasoning with problems containing theories and quantifiers. The first technique allows us to utilise the power of SMT solving to find useful instances of non-ground clauses. The second technique presents a solution to the issue of full abstraction by lazily abstracting clauses to allow them to unify under theory constraints. Our experimental results show that these approaches can solve problems previously unsolvable by Vampire and other solvers.
There are two directions for future research that we believe will further increase the power of this technique. Firstly, to explore the relationship between this approach and the AVATAR modulo theories work and, secondly, to relax the restriction of theory instantiation to single concrete models.
References
1. Akbarpour, B., Paulson, L.C.: Extending a resolution prover for inequalities on elementary functions. In: Dershowitz, N., Voronkov, A. (eds.) LPAR 2007. LNCS (LNAI), vol. 4790, pp. 47–61. Springer, Heidelberg (2007). https://doi.org/10.1007/9783540755609_6
2. Althaus, E., Kruglov, E., Weidenbach, C.: Superposition modulo linear arithmetic SUP(LA). In: Ghilardi, S., Sebastiani, R. (eds.) FroCoS 2009. LNCS (LNAI), vol. 5749, pp. 84–99. Springer, Heidelberg (2009). https://doi.org/10.1007/9783642042225_5
3. Bachmair, L., Ganzinger, H., Waldmann, U.: Refutational theorem proving for hierarchic first-order theories. Appl. Algebra Eng. Commun. Comput. 5, 193–212 (1994)
4. Barrett, C., Conway, C.L., Deters, M., Hadarean, L., Jovanović, D., King, T., Reynolds, A., Tinelli, C.: CVC4. In: Gopalakrishnan, G., Qadeer, S. (eds.) CAV 2011. LNCS, vol. 6806, pp. 171–177. Springer, Heidelberg (2011). https://doi.org/10.1007/9783642221101_14
5. Barrett, C., Stump, A., Tinelli, C.: The Satisfiability Modulo Theories Library (SMT-LIB) (2010). www.SMTLIB.org
6. Baumgartner, P., Bax, J., Waldmann, U.: Beagle – a hierarchic superposition theorem prover. In: Felty, A.P., Middeldorp, A. (eds.) CADE 2015. LNCS (LNAI), vol. 9195, pp. 367–377. Springer, Cham (2015). https://doi.org/10.1007/9783319214016_25
7. Baumgartner, P., Waldmann, U.: Hierarchic superposition with weak abstraction. In: Bonacina, M.P. (ed.) CADE 2013. LNCS (LNAI), vol. 7898, pp. 39–57. Springer, Heidelberg (2013). https://doi.org/10.1007/9783642385742_3
8. Bonacina, M.P., Lynch, C., de Moura, L.M.: On deciding satisfiability by theorem proving with speculative inferences. J. Autom. Reasoning 47(2), 161–189 (2011)
9. de Moura, L., Bjørner, N.: Efficient E-matching for SMT solvers. In: Pfenning, F. (ed.) CADE 2007. LNCS (LNAI), vol. 4603, pp. 183–198. Springer, Heidelberg (2007). https://doi.org/10.1007/9783540735953_13
10. de Moura, L., Bjørner, N.: Engineering DPLL(T) + Saturation. In: Armando, A., Baumgartner, P., Dowek, G. (eds.) IJCAR 2008. LNCS (LNAI), vol. 5195, pp. 475–490. Springer, Heidelberg (2008). https://doi.org/10.1007/9783540710707_40
11. de Moura, L., Bjørner, N.: Z3: an efficient SMT solver. In: Ramakrishnan, C.R., Rehof, J. (eds.) TACAS 2008. LNCS, vol. 4963, pp. 337–340. Springer, Heidelberg (2008). https://doi.org/10.1007/9783540788003_24
12. Detlefs, D., Nelson, G., Saxe, J.B.: Simplify: a theorem prover for program checking. J. ACM 52(3), 365–473 (2005)
13. Ganzinger, H., Korovin, K.: Theory instantiation. In: Hermann, M., Voronkov, A. (eds.) LPAR 2006. LNCS (LNAI), vol. 4246, pp. 497–511. Springer, Heidelberg (2006). https://doi.org/10.1007/11916277_34
14. Graf, P.: Substitution tree indexing. In: Hsiang, J. (ed.) RTA 1995. LNCS, vol. 914, pp. 117–131. Springer, Heidelberg (1995). https://doi.org/10.1007/3540592008_52
15. Hoder, K., Reger, G., Suda, M., Voronkov, A.: Selecting the selection. In: Olivetti, N., Tiwari, A. (eds.) IJCAR 2016. LNCS (LNAI), vol. 9706, pp. 313–329. Springer, Cham (2016). https://doi.org/10.1007/9783319402291_22
16. Korovin, K., Voronkov, A.: Integrating linear arithmetic into superposition calculus. In: Duparc, J., Henzinger, T.A. (eds.) CSL 2007. LNCS, vol. 4646, pp. 223–237. Springer, Heidelberg (2007). https://doi.org/10.1007/9783540749158_19
17. Kovács, L., Robillard, S., Voronkov, A.: Coming to terms with quantified reasoning. In: Proceedings of the 44th ACM SIGPLAN Symposium on Principles of Programming Languages, POPL 2017, Paris, France, 18–20 January 2017, pp. 260–270. ACM (2017)
18. Kovács, L., Voronkov, A.: First-order theorem proving and Vampire. In: Sharygina, N., Veith, H. (eds.) CAV 2013. LNCS, vol. 8044, pp. 1–35. Springer, Heidelberg (2013). https://doi.org/10.1007/9783642397998_1
19. Nonnengart, A., Weidenbach, C.: Computing small clause normal forms. In: Robinson, A., Voronkov, A. (eds.) Handbook of Automated Reasoning (in 2 volumes), pp. 335–367. Elsevier and MIT Press, Cambridge (2001)
20. Prevosto, V., Waldmann, U.: SPASS+T. In: Proceedings of the FLoC 2006 Workshop on Empirically Successful Computerized Reasoning, 3rd International Joint Conference on Automated Reasoning, vol. 192, CEUR Workshop Proceedings, pp. 19–33 (2006)
21. Reger, G., Bjørner, N., Suda, M., Voronkov, A.: AVATAR modulo theories. In: 2nd Global Conference on Artificial Intelligence, GCAI 2016, EPiC Series in Computing, vol. 41, pp. 39–52. EasyChair (2016)
22. Reger, G., Suda, M.: Set of support for theory reasoning. In: IWIL Workshop and LPAR Short Presentations, Kalpa Publications in Computing, vol. 1, pp. 124–134. EasyChair (2017)
23. Reger, G., Suda, M., Voronkov, A.: Playing with AVATAR. In: Felty, A.P., Middeldorp, A. (eds.) CADE 2015. LNCS (LNAI), vol. 9195, pp. 399–415. Springer, Cham (2015). https://doi.org/10.1007/9783319214016_28
24. Reger, G., Suda, M., Voronkov, A.: The challenges of evaluating a new feature in Vampire. In: Proceedings of the 1st and 2nd Vampire Workshops, EPiC Series in Computing, vol. 38, pp. 70–74. EasyChair (2016)
25. Reger, G., Suda, M., Voronkov, A.: New techniques in clausal form generation. In: 2nd Global Conference on Artificial Intelligence, GCAI 2016, EPiC Series in Computing, vol. 41, pp. 11–23. EasyChair (2016)
26. Reger, G., Suda, M., Voronkov, A.: Instantiation and pretending to be an SMT solver with Vampire. In: Proceedings of the 15th International Workshop on Satisfiability Modulo Theories, CEUR Workshop Proceedings, vol. 1889, pp. 63–75 (2017)
27. Reynolds, A., Tinelli, C., de Moura, L.M.: Finding conflicting instances of quantified formulas in SMT. In: Formal Methods in Computer-Aided Design, FMCAD 2014, Lausanne, Switzerland, 21–24 October 2014, pp. 195–202 (2014)
28. Rümmer, P.: A constraint sequent calculus for first-order logic with linear integer arithmetic. In: Cervesato, I., Veith, H., Voronkov, A. (eds.) LPAR 2008. LNCS (LNAI), vol. 5330, pp. 274–289. Springer, Heidelberg (2008). https://doi.org/10.1007/9783540894391_20
29. Rümmer, P.: E-matching with free variables. In: Bjørner, N., Voronkov, A. (eds.) LPAR 2012. LNCS, vol. 7180, pp. 359–374. Springer, Heidelberg (2012). https://doi.org/10.1007/9783642287176_28
30. Sekar, R., Ramakrishnan, I., Voronkov, A.: Term indexing. In: Robinson, A., Voronkov, A. (eds.) Handbook of Automated Reasoning, vol. II, pp. 1853–1964. Elsevier Science, Amsterdam (2001). Chap. 26
31. Stickel, M.E.: Automated deduction by theory resolution. J. Autom. Reasoning 1(4), 333–355 (1985)
32. Suda, M., Reger, G., Voronkov, A.: Unification with abstraction and theory instantiation in saturationbased reasoning. EasyChair Preprint no. 1. EasyChair (2017). https://easychair.org/publications/preprint/1
33. Sutcliffe, G.: The TPTP problem library and associated infrastructure. J. Autom. Reasoning 43(4), 337–362 (2009)
34. Voronkov, A.: AVATAR: the architecture for first-order theorem provers. In: Biere, A., Bloem, R. (eds.) CAV 2014. LNCS, vol. 8559, pp. 696–710. Springer, Cham (2014). https://doi.org/10.1007/9783319088679_46
Copyright information
Open Access This chapter is licensed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license and indicate if changes were made.
The images or other third party material in this chapter are included in the chapter's Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the chapter's Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.