
1 Introduction

Automated theorem provers based on equational completion [4], such as Waldmeister, MædMax or Twee [13, 21, 25], routinely outperform superposition-based provers on unit equality problems (UEQ) in competitions such as CASC [22], despite the fact that the superposition calculus was developed as a generalisation of completion to full clausal first-order logic with equality [19]. One of the main ingredients for their good performance is the use of ground joinability criteria for the deletion of redundant equations [1], among other techniques. However, existing proofs of refutational completeness of deduction calculi wrt. these criteria are restricted to unit equalities and rely on proof orderings and proof reductions [1, 2, 4], which are not easily extensible to general clauses together with redundancy elimination.

Since completion provers perform very poorly (or not at all) on non-UEQ problems (relying at best on incomplete transformations to unit equality [8]), this motivates an attempt to transfer those techniques to the superposition calculus and prove their completeness, so as to combine the generality of the superposition calculus with the powerful simplification rules of completion. To our knowledge, no prover for first-order logic incorporates ground joinability redundancy criteria, except for particular theories such as associativity-commutativity (AC) [20].

For instance, if \(f(x,y)\approx f(y,x)\) is an axiom, then the equation \(f(x,f(y,z))\approx f(x,f(z,y))\) is redundant, but this cannot be justified by any simplification rule in the superposition calculus. On the other hand, a completion prover which implements ground joinability can easily delete the latter equation wrt. the former. We show that ground joinability can be enabled in the superposition calculus without compromising completeness.

As another example, the simplification rule in completion can use \(f(x)\approx s\) (when \(f(x)\succ s\)) to rewrite \(f(a)\approx t\) regardless of how s and t compare, while the corresponding demodulation rule in superposition can only rewrite if \(s \prec t\). Our “encompassment demodulation” rule matches the former, while also being complete in the superposition calculus.

In [11] we introduced a novel theoretical framework for proving completeness of the superposition calculus, based on an extension of Bachmair-Ganzinger model construction [5], together with a new notion of redundancy called “closure redundancy”. We used it to prove that certain AC joinability criteria, long used in the context of completion [1], could also be incorporated in the superposition calculus for full first-order logic while preserving completeness.

In this paper, we extend this framework to show the completeness of the superposition calculus extended with: (i) a general ground joinability simplification rule, (ii) an improved encompassment demodulation simplification rule, (iii) a connectedness simplification rule extending [3, 21], and (iv) a new ground connectedness simplification rule. The proof of completeness that enables these extensions is based on a new encompassment closure ordering. In practice, these extensions help superposition to be competitive with completion on UEQ problems, and improve the performance on non-UEQ problems, which currently do not benefit from these techniques at all.

We also present a novel incremental algorithm to check ground joinability, which is very efficient in practice; this is important since ground joinability can be an expensive criterion to test. Finally, we discuss some of the experimental results we obtained after implementing these techniques in iProver [10, 16].

The paper is structured as follows. In Sect. 2 we define some basic notions to be used throughout the paper. In Sect. 3 we define the closure ordering we use to prove redundancies. In Sect. 4 we present redundancy criteria for demodulation, ground joinability, connectedness, and ground connectedness. We prove their completeness in the superposition calculus, and discuss a concrete algorithm for checking ground joinability, and how it may improve on the algorithms used in e.g. Waldmeister [13] or Twee [21]. In Sect. 5 we discuss experimental results.

2 Preliminaries

We consider a signature consisting of a finite set of function symbols and the equality predicate as the only predicate symbol. We fix a countably infinite set of variables. First-order terms are defined in the usual manner. Terms without variables are called ground terms. A literal is an unordered pair of terms with either positive or negative polarity, written \(s \approx t\) and \(s\not \approx t\) respectively (we write \(s \mathbin {\dot{\approx }}t\) to mean either of the former two). A clause is a multiset of literals. Collectively terms, literals, and clauses will be called expressions.

A substitution is a mapping from variables to terms which is the identity for all but finitely many variables. An injective substitution onto variables is called a renaming. If e is an expression, we denote application of a substitution \(\sigma \) by \(e\sigma \), replacing all variables with their image in \(\sigma \). Let \({\text {GSubs}}(e) = \{ \sigma \mid e\sigma \text { is ground} \}\) be the set of ground substitutions for e. Overloading this notation for sets we write \({\text {GSubs}}(E) = \{ \sigma \mid \forall e \in E.~ e\sigma \text { is ground} \}\). Finally, we write e.g. \({\text {GSubs}}(e_1,e_2)\) instead of \({\text {GSubs}}(\{e_1,e_2\})\). The identity substitution is denoted by \(\epsilon \).
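To fix intuitions, substitution application can be sketched in a few lines of Python; the term representation (variables as strings, compound terms as tuples headed by their symbol) and the function names are our own conventions, not from the paper:

```python
# Terms: a variable is a string; a compound term is a tuple ('f', arg1, ...);
# a constant is a 1-tuple such as ('a',).
# A substitution is a dict mapping variables to terms; unmapped variables are
# left unchanged (identity on all but finitely many variables).

def apply_subst(term, sigma):
    """Apply substitution sigma to term, replacing each variable by its image."""
    if isinstance(term, str):                      # variable
        return sigma.get(term, term)
    return (term[0],) + tuple(apply_subst(a, sigma) for a in term[1:])

def is_ground(term):
    """A term is ground if it contains no variables."""
    if isinstance(term, str):
        return False
    return all(is_ground(a) for a in term[1:])
```

For example, applying \(\{x\mapsto a,\ y\mapsto b\}\) to \(f(x,g(y))\) yields the ground term \(f(a,g(b))\), so this substitution belongs to \({\text {GSubs}}(f(x,g(y)))\).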

A substitution \(\theta \) is more general than \(\sigma \) if \(\theta {\rho } = \sigma \) for some substitution \({\rho }\) which is not a renaming. If s and t can be unified, that is, if there exists \(\sigma \) such that \(s\sigma = t\sigma \), then there also exists the most general unifier, written \({{\,\mathrm{mgu}\,}}(s,t)\). A term s is said to be more general than t if there exists a substitution \(\theta \) that makes \(s\theta = t\) but there is no substitution \(\sigma \) such that \(t\sigma = s\). Two terms s and t are said to be equal modulo renaming if there exist injective \(\theta ,\sigma \) such that \(s\theta = t\) and \(t\sigma = s\). The relations “less general than”, “equal modulo renaming”, and their union are represented respectively by the symbols \({\mathrel {\sqsupset }}\), \({\equiv }\), and \({\mathrel {\sqsupseteq }}\).
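The generality relations above are decidable by syntactic matching. Here is a minimal Python sketch (same term representation as before; `match` and `strictly_more_general` are hypothetical helper names of our own):

```python
def match(s, t, sigma=None):
    """Return a substitution sigma with s·sigma == t, or None if none exists."""
    sigma = dict(sigma or {})
    if isinstance(s, str):                         # variable: bind or check binding
        if s in sigma:
            return sigma if sigma[s] == t else None
        sigma[s] = t
        return sigma
    if isinstance(t, str) or s[0] != t[0] or len(s) != len(t):
        return None
    for a, b in zip(s[1:], t[1:]):
        sigma = match(a, b, sigma)
        if sigma is None:
            return None
    return sigma

def strictly_more_general(s, t):
    """s is strictly more general than t: s matches onto t but not conversely."""
    return match(s, t) is not None and match(t, s) is None
```

For instance, \(f(x,y)\) is strictly more general than \(f(z,z)\): the former matches onto the latter, but not conversely, so \(f(z,z) \mathrel {\sqsupset }f(x,y)\).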

A more refined notion of instance is that of closure [6]. Closures are pairs \(e \cdot \sigma \) that are said to represent the expression \(e\sigma \) while retaining information about the original term and its instantiation. Closures where \(e\sigma \) is ground are said to be ground closures. Let \({\text {GClos}}(e) = \{ e \cdot \sigma \mid e\sigma \text { is ground} \}\) be the set of ground closures of e. Overloading the notation for sets, if N is a set of clauses then \({\text {GClos}}(N) = \bigcup _{C{\in }N} {\text {GClos}}(C)\).

We write s[t] if t is a subterm of s. If also \(s \ne t\), then it is a strict subterm. We denote these relations by \(s \unrhd t\) and \(s \rhd t\) respectively. We write \(s[t\mapsto t']\) to denote the term obtained from s by replacing all occurrences of t by \(t'\).

A (strict) partial order is a binary relation which is transitive (\(a \succ b \succ c \mathrel {\Rightarrow }a \succ c\)), irreflexive (\(a \nsucc a\)), and asymmetric (\(a \succ b \mathrel {\Rightarrow }b \nsucc a\)). A (non-strict) partial preorder (or quasiorder) is any transitive, reflexive relation. A (pre)order is total over X if \({\forall }x,y\in X.~x \succeq y \vee y \succeq x\). Whenever a non-strict (pre)order \(\succeq \) is given, the induced equivalence relation \(\sim \) is \({\succeq } \cap {\preceq }\), and the induced strict (pre)order \(\succ \) is \({\succeq } \setminus {\sim }\). The transitive closure of a relation \(\succ \), the smallest transitive relation that contains \(\succ \), is denoted by \(\mathrel {\succ ^{+}}\). A transitive reduction of a relation \(\succ \), a smallest relation whose transitive closure is \(\succ \), is denoted by \(\mathrel {\succ ^{-}}\).

For an ordering \({\succ }\) over a set X, its multiset extension \({\mathrel {\succ \succ }}\) over multisets of X is given by: \(A \mathrel {\succ \succ }B\) iff \(A\ne B\) and \({\forall }x\in B .~ B(x)>A(x) \mathrel {\Rightarrow }{\exists }y\in A .~ y \succ x \wedge A(y)>B(y)\), where A(x) is the number of occurrences of element x in multiset A (we also use \(\mathrel {\succ \succ \succ }\) for the multiset extension of \(\mathrel {\succ \succ }\)). It is well known that the multiset extension of a well-founded/total order is also a well-founded/total order, respectively [9]. The (n-fold) lexicographic extension of \(\succ \) over X is denoted \(\succ _\text {lex}\) over ordered n-tuples of X, and is given by \(\langle x_1,\dotsc ,x_n \rangle \succ _\text {lex} \langle y_1,\dotsc ,y_n \rangle \) iff \({\exists }i.~ x_1 = y_1 \mathrel {\wedge } \cdots \mathrel {\wedge } x_{i-1} = y_{i-1} \mathrel {\wedge } x_i \succ y_i\). The lexicographic extension of a well-founded/total order is also a well-founded/total order, respectively.
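As a sanity check, the multiset extension condition can be tested directly. A small Python sketch, with multisets as `Counter`s and the base order passed in as a predicate `gt` (our own conventions):

```python
from collections import Counter

def multiset_gt(A, B, gt):
    """A >> B iff A != B and every element with more copies in B than in A
    is dominated by some element with more copies in A than in B."""
    A, B = Counter(A), Counter(B)
    if A == B:
        return False
    for x in B:
        if B[x] > A[x]:
            if not any(A[y] > B[y] and gt(y, x) for y in A):
                return False
    return True
```

For example, \(\{2,2\} \mathrel {\succ \succ }\{2,1,1\}\) under the usual order on naturals, but not conversely.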

A binary relation \({\rightarrow }\) over the set of terms is a rewrite relation if (i) \(l \rightarrow r \mathrel {\Rightarrow }l\sigma \rightarrow r\sigma \) and (ii) \(l \rightarrow r \mathrel {\Rightarrow }s[l] \rightarrow s[l\mapsto r]\). The reflexive-transitive closure of a relation is the smallest reflexive-transitive relation which contains it. It is denoted by \({\mathrel {\overset{*}{\rightarrow }}}\). Two terms are joinable (\(s \mathrel {\downarrow } t\)) if \(s \mathrel {\overset{*}{\rightarrow }}u \mathrel {\overset{*}{\leftarrow }}t\).

If a rewrite relation is also a strict ordering, then it is a rewrite ordering. A reduction ordering is a rewrite ordering which is well-founded. In this paper we consider reduction orderings which are total on ground terms; such orderings are also simplification orderings, i.e. they satisfy \(s \rhd t \mathrel {\Rightarrow }s \succ t\).

3 Ordering

In [11] we presented a novel proof of completeness of the superposition calculus based on the notion of closure redundancy, which enables the completeness of stronger redundancy criteria to be shown, including AC normalisation, AC joinability, and encompassment demodulation. In this paper we use a slightly different closure ordering (\(\succ _{cc}\)), in order to extract better completeness conditions for the redundancy criteria that we present in this paper (the definition of closure redundant clause and closure redundant inference is parametrised by this \(\succ _{cc}\)).

Let \(\succ _t\) be a simplification ordering which is total on ground terms. We extend this first to an ordering on ground term closures, then to an ordering on ground clause closures. Let

$$\begin{aligned}&{s \cdot \sigma \succ _{tc'} t \cdot {\rho } \quad \text {iff}\quad s\sigma \succ _t t{\rho } ,\ \text { or }\ s\sigma = t{\rho } \text { and } s \mathrel {\sqsupset }t ,} \end{aligned}$$
(1)

where \(s\sigma \) and \(t{\rho }\) are ground, and let \(\succ _{tc}\) be an (arbitrary) total well-founded extension of \(\succ _{tc'}\). We extend this to an ordering on clause closures. First let

$$\begin{aligned}&{M_{lc}({(s\approx t) \cdot \theta }) = \{ s\theta \cdot \epsilon , t\theta \cdot \epsilon \} ,} \end{aligned}$$
(2)
$$\begin{aligned}&{M_{lc}({(s \not \approx t) \cdot \theta }) = \{ s\theta \cdot \epsilon , t\theta \cdot \epsilon , s\theta \cdot \epsilon , t\theta \cdot \epsilon \} ,} \end{aligned}$$
(3)

and let \(M_{cc}\) be defined as follows, depending on whether the clause is unit or non-unit:

$$\begin{aligned}&{M_{cc}(\emptyset \cdot \theta ) = \emptyset ,} \end{aligned}$$
(4)
$$\begin{aligned}&{M_{cc}((s\approx t) \cdot \theta ) = \{ \{ s \cdot \theta \} , \{ t \cdot \theta \} \} ,}\end{aligned}$$
(5)
$$\begin{aligned}&{M_{cc}((s\not \approx t) \cdot \theta ) = \{ \{ s \cdot \theta , t \cdot \theta , s\theta \cdot \epsilon , t\theta \cdot \epsilon \} \} ,}\end{aligned}$$
(6)
$$\begin{aligned}&{M_{cc}((s\mathbin {\dot{\approx }}t \vee \cdots ) \cdot \theta ) = \{ M_{lc}(L \cdot \theta ) \mid L\in (s\mathbin {\dot{\approx }}t \vee \cdots ) \} ,} \end{aligned}$$
(7)

then \(\succ _{cc}\) is defined by

$$\begin{aligned} C \cdot \sigma \succ _{cc} D \cdot {\rho }&\text {iff } \quad&M_{cc}(C \cdot \sigma ) \mathrel {\succ \succ \succ }_{tc} M_{cc}(D \cdot {\rho }) . \end{aligned}$$
(8)

The main purpose of this definition is twofold: (i) that when \(s\theta \succ _t t\theta \) and u occurs in a clause D, then \(s\theta \lhd u\) or \(s \mathrel {\sqsubset }s\theta = u\) implies \((s\approx t) \cdot \theta {\rho } \prec _{cc} D \cdot {\rho }\), and (ii) that when C is a positive unit clause, D is not, s is the maximal subterm in \(C\theta \) and t is the maximal subterm in \(D\sigma \), then \(s\succeq _tt\) implies \(C \cdot \theta \prec _{cc} D \cdot \sigma \). These two properties enable unconditional rewrites via oriented unit equations on positive unit clauses to succeed whenever they would also succeed in unfailing completion [4], and rewrites on negative unit and non-unit clauses to always succeed. This will enable us to prove the correctness of the simplification rules presented in the following section.

4 Redundancies

In this section we present several redundancy criteria for the superposition calculus and prove their completeness. Recall the definitions in [11]: a clause C is redundant in a set S if all its ground closures \(C \cdot \theta \) follow from closures in \({\text {GClos}}(S)\) which are smaller wrt. \(\succ _{cc}\); an inference with premises \(C_1,\dotsc ,C_n\) and conclusion D is redundant in a set S if, for all \(\theta \in {\text {GSubs}}(C_1,\dots ,C_n,D)\) such that the instance with premises \(C_1\theta ,\dotsc ,C_n\theta \) and conclusion \(D\theta \) is a valid inference, the closure \(D \cdot \theta \) follows from closures in \({\text {GClos}}(S)\), each of which is smaller than some \(C_1 \cdot \theta ,\dotsc ,C_n \cdot \theta \). These definitions (in terms of ground closures rather than in terms of ground clauses, as in [19]) arise because they enable us to justify stronger redundancy criteria for application in superposition theorem provers, including the AC criteria developed in [11] and the criteria in this section.

Theorem 1

The superposition calculus [19] is refutationally complete wrt. closure redundancy, that is, if a set of clauses is saturated up to closure redundancy (meaning any inference with non-redundant premises in the set is redundant) and does not contain the empty clause, then it is satisfiable.

Proof

The proof of completeness of the superposition calculus wrt. this closure ordering carries over from [11] with some modifications, which are presented in a full version of this paper [12].

4.1 Encompassment Demodulation

We introduce the following definition, to be re-used throughout the paper.

Definition 1

A rewrite via \(l\approx r\) in clause \(C[l\theta ]\) is admissible if one of the following conditions holds: (i) C is not a positive unit, or (writing \(C = s[l\theta ]\approx t\)) (ii) \(l\theta \ne s\), or (iii) \(l\theta \mathrel {\sqsupset }l\), or (iv) \(s\prec _tt\), or (v) \(r\theta \prec _tt\).

We then have

$$\begin{aligned}&{\frac{l\approx r \qquad C[l\theta ]}{l\approx r \qquad C[l\theta \mapsto r\theta ]} \quad \text {if } l\theta \succ _t r\theta \text { and the rewrite is admissible in } C \text { (the premise } C[l\theta ] \text { is deleted),}} \end{aligned}$$
(9)

In other words, given an equation \(l\approx r\), if an instance \(l\theta \) is a subterm in C, then the rewrite is admissible (meaning, for example, that an unconditional rewrite is allowed when \(l\theta \succ _tr\theta \)) if C is not a positive unit, or if \(l\theta \) occurs at a strict subterm position, or if \(l\theta \) is less general than l, or if \(l\theta \) occurs outside a maximal side, or if \(r\theta \) is smaller than the other side. This restriction is much weaker than the one given for the usual demodulation rule in superposition [17], and equivalent to the one in equational completion when we restrict ourselves to unit equalities [4].

Example 1

If \(f(x)\succ _ts\), we can use \(f(x)\approx s\) to rewrite \(f(x)\approx t\) when \(s\prec _tt\), and \(f(a)\approx t\), \(f(x)\not \approx t\), or \(f(x)\approx t \vee C\) regardless of how s and t compare.

4.2 General Ground Joinability

In [11] we developed redundancy criteria for the theory of AC functions in the superposition calculus. In this section we extend these techniques to develop redundancy criteria for ground joinability in arbitrary equational theories.

Definition 2

Two terms s and t are strongly joinable in a clause C wrt. a set of equations S if either \(s=t\), or \(s \rightarrow s[l_1\sigma _1 \mathbin {\mapsto }r_1\sigma _1] \mathrel {\overset{*}{\rightarrow }}t\) via rules \(l_i\approx r_i\in S\), where the rewrite via \(l_1\approx r_1\) is admissible in C, or \(s \rightarrow s[l_1\sigma _1 \mathbin {\mapsto }r_1\sigma _1] \downarrow t[l_2\sigma _2 \mathbin {\mapsto }r_2\sigma _2] \leftarrow t\) via rules \(l_i\approx r_i\in S\), where the rewrites via \(l_1\approx r_1\) and \(l_2\approx r_2\) are admissible in C. To make the ordering explicit, we may say strongly joinable under \(\succ \). Two terms s and t are strongly ground joinable in a clause C wrt. a set of equations S if for all \(\theta \in {\text {GSubs}}(s,t)\), the terms \(s\theta \) and \(t\theta \) are strongly joinable in C wrt. S.

We then have:

(Ground Joinability rule: if s and t are strongly ground joinable in \(s \mathbin {\dot{\approx }}t \vee C\) wrt. S, then \(s\approx t \vee C\) may be deleted and \(s\not \approx t \vee C\) may be replaced by C; rule figure omitted.)

Theorem 2

Ground joinability is a sound and admissible redundancy criterion of the superposition calculus wrt. closure redundancy.

Proof

We will show the positive case first. If s and t are strongly ground joinable in \(s\approx t \vee C\) wrt. S, then for any instance \((s\approx t \vee C) \cdot \theta \) we either have \(s\theta =t\theta \), and therefore \(\emptyset \models (s\approx t) \cdot \theta \), or we have wlog. \(s\theta \succ _t t\theta \), with \(s\theta \mathrel {\downarrow } t\theta \). Then \(s\theta \) and \(t\theta \) can be rewritten to the same normal form u by \(l_i\sigma _i \mathbin {\rightarrow }r_i\sigma _i\) where \(l_i\approx r_i\in S\). Since \(u\prec _ts\theta \) and \(u\preceq _tt\theta \), then \((s\approx t \vee C) \cdot \theta \) follows from the smaller \((u\approx u \vee C) \cdot \theta \) (a tautology, i.e. follows from \(\emptyset \)) and from the instances of clauses in S used to rewrite \(s\theta \rightarrow u \leftarrow t\theta \). It only remains to show that these latter instances are also smaller than \((s\approx t \vee C) \cdot \theta \). Since we have assumed \(s\theta \succ _t t\theta \), at least one rewrite step must be done on \(s\theta \). Let \(l_1\sigma _1 \mathbin {\rightarrow }r_1\sigma _1\) be the instance of the rule used for that step, with \((l_1\approx r_1) \cdot \sigma _1\) the closure that generates it. By Definitions 1 and 2, one of the following holds:

  • \(C \ne \emptyset \), therefore \((l_1\approx r_1) \cdot \sigma _1 \prec _{cc} (s\approx t \vee C) \cdot \theta \), or

  • \(l_1\sigma _1 \lhd s\theta \), therefore \(l_1\sigma _1 \prec _t s\theta \mathrel {\Rightarrow }l_1 \cdot \sigma _1 \prec _{tc} s \cdot \theta \mathrel {\Rightarrow }(l_1\approx r_1) \cdot \sigma _1 \prec _{cc} (s\approx t) \cdot \theta \), or

  • \(l_1\sigma _1 = s\theta \) and \(s \mathrel {\sqsupset }l_1\), therefore \(l_1 \cdot \sigma _1 \prec _{tc} s \cdot \theta \mathrel {\Rightarrow }(l_1\approx r_1) \cdot \sigma _1 \prec _{cc} (s\approx t) \cdot \theta \), or

  • \(l_1\sigma _1 = s\theta \) and \(s \equiv l_1\) and \(r_1\sigma _1 \prec _t t\theta \), therefore \(r_1 \cdot \sigma _1 \prec _{tc} t \cdot \theta \mathrel {\Rightarrow }(l_1\approx r_1) \cdot \sigma _1 \prec _{cc} (s\approx t) \cdot \theta \).

As for the remaining steps, they are done on the smaller side \(t\theta \) or on the other side after this first rewrite, which is smaller than \(s\theta \). Therefore all subsequent steps done by any \(l_j\sigma _j \mathbin {\rightarrow }r_j\sigma _j\) will have \(r_j \cdot \sigma _j \prec _{tc} l_j \cdot \sigma _j \prec _{tc} s \cdot \theta \mathrel {\Rightarrow }(l_j\approx r_j) \cdot \sigma _j \prec _{cc} (s\approx t\vee C) \cdot \theta \). As such, since this holds for all ground closures \((s\approx t\vee C) \cdot \theta \), then \(s\approx t \vee C\) is redundant wrt. S.

For the negative case, the proof is similar. We will conclude that \((s\not \approx t \vee C) \cdot \theta \) follows from smaller \((l_i\approx r_i) \cdot \sigma _i\in {\text {GClos}}(S)\) and smaller \((u\not \approx u \vee C) \cdot \theta \). The latter, of course, follows from smaller \(C \cdot \theta \), therefore \(s\not \approx t \vee C\) is redundant wrt. \(S \cup \{C\}\).    \(\square \)

Example 2

If \(S = \{ f(x,y)\approx f(y,x) \}\), then \(f(x,f(y,z))\approx f(x,f(z,y))\) is redundant wrt. S. Note that \(f(x,y)\approx f(y,x)\) is not orientable by any simplification ordering, therefore this cannot be justified by demodulation alone.
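This example can also be checked mechanically on ground instances: applying the commutativity axiom only in the direction that decreases some total order on ground terms, both sides reach the same normal form. A toy Python illustration (the term encoding, the bare strings for constants, and the use of the printed representation as a stand-in total ground order are our own simplifications):

```python
from itertools import product

def nf(t):
    """Normalise under f(x,y) ~ f(y,x), applying an instance only when it is
    decreasing in a fixed total order on ground terms (here: repr comparison,
    a toy stand-in for a reduction ordering)."""
    if isinstance(t, str):
        return t
    args = [nf(a) for a in t[1:]]
    if t[0] == 'f' and len(args) == 2 and repr(args[0]) > repr(args[1]):
        args = [args[1], args[0]]          # f(s,t) -> f(t,s) when s > t
    return (t[0],) + tuple(args)

def subst(t, sigma):
    if isinstance(t, str):
        return sigma.get(t, t)
    return (t[0],) + tuple(subst(a, sigma) for a in t[1:])

lhs = ('f', 'x', ('f', 'y', 'z'))          # f(x, f(y, z))
rhs = ('f', 'x', ('f', 'z', 'y'))          # f(x, f(z, y))
# every ground instance over constants a, b, c joins to the same normal form
joinable = all(
    nf(subst(lhs, dict(zip('xyz', v)))) == nf(subst(rhs, dict(zip('xyz', v))))
    for v in product('abc', repeat=3))
```

Here each swap is an instance of the commutativity axiom applied in its decreasing direction, so the common normal form witnesses ground joinability for that instance.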

Testing for Ground Joinability. The general criterion presented above raises the question of how to test, in practice, whether s and t are strongly ground joinable in a clause \(s{\mathbin {\dot{\approx }}}t{\vee }C\). Several such algorithms have been proposed [1, 18, 21]. All of these are based on the observation that if we consider all total preorders \({\succeq _v}\) on \({\text {Vars}}(s,t)\) and for all of them show strong joinability with a modified ordering—which we denote \(\mathrel {\succ _{t[v]}}\)—then we have shown strong ground joinability in the order \(\succ _t\) [18].

Definition 3

A simplification order on terms \(\succ _t\) extended with a preorder on variables \(\succeq _v\), denoted \(\mathrel {\succeq _{t[v]}}\), is a simplification preorder (i.e. satisfies all the relevant properties in Sect. 2) such that \({\mathrel {\succeq _{t[v]}}} \supseteq {{\succ _t} \cup {\succeq _v}}\).

Example 3

If \(x \succ _v y\), then \(g(x) \mathrel {\succ _{t[v]}} g(y)\), \(g(x) \mathrel {\succ _{t[v]}} y\), \(f(x,y) \mathrel {\succ _{t[v]}} f(y,x)\), etc.

The simplest algorithm based on this approach would be to enumerate all possible total preorders \(\succeq _v\) over \({\text {Vars}}(s,t)\), and exhaustively reduce both sides via equations in S orientable by \(\mathrel {\succ _{t[v]}}\), checking if the terms can be reduced to the same normal form for all total preorders. This is very inefficient since there are \(\mathcal {O}({n!e^{n}})\) such total preorders [7], where n is the cardinality of \({\text {Vars}}(s,t)\). Another approach is to consider only a smaller number of partial preorders, based on the obvious fact that joinability under \(\mathrel {\succ _{t[v]}}\) implies joinability under \(\mathrel {\succ _{t[v']}}\) for any \({\succeq _v'} \supseteq {\succeq _v}\), so that joinability under a smaller number of partial preorders can imply joinability under all the total preorders, as is necessary to prove ground joinability.
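Concretely, total preorders on \({\text {Vars}}(s,t)\) correspond to ordered partitions into equivalence blocks, and the naive algorithm would have to enumerate them all; the counts for \(n = 1,\dotsc ,5\) variables are 1, 3, 13, 75, 541 (the ordered Bell numbers). A Python sketch of this enumeration (the block representation is our own):

```python
from itertools import combinations

def total_preorders(xs):
    """Yield each total preorder on xs as a tuple of blocks, greatest first:
    elements in the same block are equivalent, earlier blocks are greater."""
    xs = list(xs)
    if not xs:
        yield ()
        return
    # choose the greatest equivalence block, then recurse on the rest
    for k in range(1, len(xs) + 1):
        for block in combinations(xs, k):
            remaining = [x for x in xs if x not in block]
            for tail in total_preorders(remaining):
                yield (block,) + tail
```

For two variables this yields the three cases \(x\succ y\), \(x\sim y\), \(y\succ x\); the growth with n is what makes the naive algorithm impractical.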

However, this poses the question of how to choose which partial preorders to check. Intuitively, for performance, we would like, whenever the two terms are not ground joinable, for some total preorder where they are not joinable to be found as early as possible; and whenever the two terms are joinable, for all total preorders to be covered by as few partial preorders as possible.

Example 4

Let \(S = \{ f(x,f(y,z)){\approx }f(y,f(x,z)) \}\). Then \(f(x,f(y,f(z,f(w,u)))){\approx }f(x,f(y,f(w,f(z,u))))\) can be shown to be ground joinable wrt. S by checking just three cases: \({\succeq _v}\in \{ z\mathord \succ w \mathrel {,}z\mathord \sim w \mathrel {,}z\mathord \prec w \}\), even though there are 6942 possible preorders.

Waldmeister first tries all partial preorders relating two variables among \({\text {Vars}}(s,t)\), then three, etc., until success, failure (by trying a total order and failing to join), or reaching a predefined limit of attempts [1]. Twee tries an arbitrary total strict order, then tries to weaken it, and repeats until all total preorders are covered [21]. We propose a novel algorithm—incremental ground joinability—whose main improvement is to guide the choice of which preorders to check: while searching for rewrites on subterms of the terms we are attempting to join, we find minimal extensions of the term order with a variable preorder which allow the rewrite to be done in the \(\succ \) direction.

Our algorithm is summarised as follows. We start with an empty queue of variable preorders, V, initially containing only the empty preorder. Then, while V is not empty, we pop a preorder \(\succeq _v\) from the queue, and attempt to perform a rewrite via an equation which is newly orientable by some extension \(\succeq _v'\) of \(\succeq _v\). That is, during the process of finding generalisations of a subterm of s or t among left-hand sides of candidate unoriented unit equations \(l\approx r\), when we check that the instance \(l\theta \approx r\theta \) used to rewrite is oriented, we try to force this to be true under some minimal extension \(\mathrel {\succ _{t[v']}}\) of \(\mathrel {\succ _{t[v]}}\), if possible. If no such rewrite exists, the two terms are not strongly joinable under \(\mathrel {\succ _{t[v]}}\) or any extension, and so are not strongly ground joinable and we are done. If it exists, we exhaustively rewrite with \(\mathrel {\succ _{t[v']}}\), and check if we obtain the same normal form. If we do not obtain it yet, we repeat the process of searching rewrites via equations orientable by further extensions of the preorder. But if we do, then we have proven joinability in the extended preorder; now we must add back to the queue a set of preorders O such that all the total preorders which are \(\supseteq {\succeq _v}\) (popped from the queue) but not \(\supseteq {\succeq _v'}\) (minimal extension under which we have proven joinability) are \(\supseteq \) of some \({\succeq _v''}\in O\) (pushed back into the queue to be checked). Obtaining this O is implemented by \(\text {order}\_\text {diff}({\succeq _v},{\succeq _v'})\), defined below. Whenever there are no more preorders in the queue to check, then we have checked that the terms are strongly joinable under all possible total preorders, and we are done.

Together with this, some book-keeping is necessary to keep track of completeness conditions. We know that for completeness to be guaranteed, the conditions in Definition 1 must hold. They automatically do if C is not a positive unit or if the rewrite happens on a strict subterm. We also know that after a term has been rewritten at least once, rewrites on that side are always complete (since it was rewritten to a smaller term). Therefore we store in the queue, together with the preorder, a flag in \(\mathcal {P}(\{ \texttt {L},\texttt {R} \})\) indicating on which sides a top rewrite needs to be checked for completeness. Initially the flag is \(\{ \texttt {L} \}\) if \(s\succ _tt\), \(\{ \texttt {R} \}\) if \(s\prec _tt\), \(\{ \texttt {L},\texttt {R} \}\) if s and t are incomparable, and \(\{ \}\) if the clause is not a positive unit. When a rewrite at the top is attempted (say, \(l\approx r\) used to rewrite \(s=l\theta \) with t being the other side), if the flag for that side is set, then we check whether \(l\theta \mathrel {\sqsupset }l\) or \(r\theta \prec _t t\). If this fails, the rewrite is rejected. Whenever a side is rewritten (at any position), the flag for that side is cleared.
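The loop described above can be rendered schematically as follows. This is a Python sketch of the control structure only, not iProver's implementation: the rewriting machinery, the search for minimal extensions, and \(\text {order}\_\text {diff}\) (defined below in the text) are passed in as parameters, and the completeness-flag book-keeping is assumed to be handled inside them:

```python
from collections import deque

def incremental_ground_joinability(s, t, normalize, find_extension, order_diff):
    """Schematic check of strong ground joinability of s and t.

    normalize(pre, u):        exhaustively rewrite u via rules oriented by the
                              term order extended with variable preorder pre
    find_extension(pre, u, v): a minimal extension of pre enabling a new
                              rewrite on u or v, or None if none exists
    order_diff(pre, ext):     preorders covering the total extensions of pre
                              that do not extend ext
    """
    queue = deque([frozenset()])       # start from the empty variable preorder
    while queue:
        pre = queue.popleft()
        cur = pre
        u, v = normalize(cur, s), normalize(cur, t)
        while u != v:
            ext = find_extension(cur, u, v)
            if ext is None:
                return False           # some total preorder cannot join s and t
            cur = ext
            u, v = normalize(cur, u), normalize(cur, v)
        # joined under cur: still must cover totals extending pre but not cur
        queue.extend(order_diff(pre, cur))
    return True                        # joined under every total preorder
```

For instance, with stub parameters modelling the single rule \(f(x,y)\approx f(y,x)\), the function joins \(f(x,y)\) and \(f(y,x)\) after exploring the three cases \(x\succ y\), \(x\sim y\), \(x\prec y\).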

The definition of \(\text {order}\_\text {diff}\) is as follows. Let the transitive reduction of \(\succeq \) be represented by a set of links of the form \(x\mathord \succ y\) / \(x\mathord \sim y\).

(Recursive definition of \(\text {order}\_\text {diff}({\succeq _1},{\succeq _2})\), described in words below; figure omitted.)

where \({\succeq _1} \subseteq {\succeq _2}\). In other words, we take a transitive reduction of \(\succeq _2\), and for all links \(\ell \) in that reduction which are not part of \(\succeq _1\), we return orders \(\succeq _1\) augmented with the reverse of \(\ell \) and recurse with \({\succeq _1} = {\succeq _1} \cup \ell \).
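Assuming \(\succeq _2\) is handed to us as a list of transitive-reduction links, as in the description above, this can be sketched in Python; a preorder is a set of facts `('>', x, y)` or `('~', x, y)` (all names are our own conventions):

```python
def geq_closure(facts):
    """Transitive closure of the non-strict relation underlying a set of links."""
    geq = set()
    for kind, x, y in facts:
        geq.add((x, y))
        if kind == '~':
            geq.add((y, x))
    changed = True
    while changed:
        changed = False
        for (a, b) in list(geq):
            for (c, d) in list(geq):
                if b == c and (a, d) not in geq:
                    geq.add((a, d))
                    changed = True
    return geq

def holds(facts, link):
    """Does the preorder generated by facts entail the given link?"""
    kind, x, y = link
    geq = geq_closure(facts)
    if kind == '>':
        return (x, y) in geq and (y, x) not in geq
    return (x, y) in geq and (y, x) in geq

def order_diff(r1, r2_links):
    """Preorders covering every total extension of r1 that does not extend
    the preorder given by the transitive-reduction links r2_links."""
    out, cur = [], set(r1)
    for link in r2_links:
        if not holds(cur, link):
            kind, x, y = link
            # augment the order built so far with each way of reversing the link
            if kind == '>':
                out.append(cur | {('~', x, y)})
                out.append(cur | {('>', y, x)})
            else:
                out.append(cur | {('>', x, y)})
                out.append(cur | {('>', y, x)})
            cur = cur | {link}
    return out
```

On the second row of Example 5 below, this returns exactly the two preorders \(x\succ y\sim z\) and \(x\succ z\succ y\).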

Example 5

 

\(\text {order}\_\text {diff}(\,x\succ y\,,\ x\succ y\succ z\succ w\,) = \{ x\succ y\sim z \mathrel {,}x\succ y\prec z \mathrel {,}x\succ y\succ z\sim w \mathrel {,}x\succ y\succ z\prec w \}\)

\(\text {order}\_\text {diff}(\,y\prec x\succ z\,,\ x\succ y\succ z\,) = \{ x\succ y\sim z \mathrel {,}x\succ z\succ y \}\)

Theorem 3

For all total \({\succeq _v^T} \supseteq {\succeq _1}\), there exists one and only one \({\succeq _{i}}\in {\{\succeq _2\}} \cup \text {order}\_\text {diff}(\succeq _1,\succeq _2)\) such that \({\succeq _v^T} \supseteq {\succeq _{i}}\). For all \({\succeq _v^T} \nsupseteq {\succeq _1}\), there is no \({\succeq _{i}}\in {\{\succeq _2\}} \cup \text {order}\_\text {diff}(\succeq _1,\succeq _2)\) such that \({\succeq _v^T} \supseteq {\succeq _{i}}\).

Proof

See full version of the paper [12].

An algorithm based on searching for rewrites in minimal extensions of a variable preorder (starting with minimal extensions of the bare term ordering, \(\mathrel {\succ _{t[\emptyset ]}}\)) has several advantages. The main benefit of this approach is that, instead of imposing an a priori ordering on variables and then checking joinability under that ordering, we instead build a minimal ordering while searching for candidate unit equations to rewrite subterms of s, t. For instance, if two terms are not ground joinable, or not even rewritable in any \(\mathrel {\succ _{t[v]}}\) where they were not rewritable in \(\succ _t\), then an approach such as the one used by Avenhaus, Hillenbrand and Löchner [1] cannot detect this until it has extended the preorder arbitrarily to a total ordering, while our incremental algorithm realises this immediately. We should note that empirically this is what happens in most cases: most of the literals we check during a run are not ground joinable, so for practical performance it is essential to optimise this case.

(Algorithm 1: the incremental ground joinability test described above; figure omitted.)

Theorem 4

Algorithm 1 returns “Success” only if s and t are strongly ground joinable in C wrt. S.

Proof

We will show that Algorithm 1 returns “Success” if and only if s and t are strongly joinable under \(\mathrel {\succ _{t[v^T]}}\) for all total \({\succeq _v^{T}}\) over \({\text {Vars}}(s,t)\), which implies that they are strongly ground joinable in C wrt. S.

When \(\langle {\succeq _v},s,t,c \rangle \) is popped from V, we exhaustively reduce s, t via equations in S oriented wrt. \(\mathrel {\succ _{t[v]}}\), obtaining \(s^r,t^r\). If \(s^r\sim _{t[v]}t^r\), then the terms are strongly joinable under \(\mathrel {\succ _{t[v]}}\), and so under \(\mathrel {\succ _{t[v^T]}}\) for all total \({\succeq _v^T} \supseteq {\succeq _v}\). If \(s^r\not \sim _{t[v]}t^r\), we will attempt to rewrite one of \(s^r,t^r\) using some extended \(\mathrel {\succ _{t[v']}}\) where \({\succeq _v'} \supset {\succeq _v}\). If this is impossible, then the terms are not strongly joinable under \(\mathrel {\succ _{t[v']}}\) for any \({\succeq _v'} \supseteq {\succeq _v}\), and therefore there exists at least one total \(\succeq _v^T\) under which they are not strongly joinable, and we return “Fail”.

If this is possible, then we repeat the process: we exhaustively reduce wrt. \(\mathrel {\succ _{t[v']}}\), obtaining \(s',t'\). If \(s'\not \sim _{t[v']}t'\), then we start again the process from the step where we attempt to rewrite via an extension of \(\succeq _v'\): we either find a rewrite with some \(\mathrel {\succ _{t[v'']}}\) with \({\succeq _v''} \supset {\succeq _v'}\), and exhaustively normalise wrt. \(\mathrel {\succ _{t[v'']}}\) obtaining \(s'',t''\), etc., or we fail to do so and return “Fail”.

If in any such step (after exhaustively normalising wrt. \(\mathrel {\succ _{t[v']}}\)) we find \(s'\sim _{t[v']}t'\), then the terms are strongly joinable under \(\mathrel {\succ _{t[v']}}\), and so under \(\mathrel {\succ _{t[v^T]}}\) for all total \({\succeq _v^T} \supseteq {\succeq _v'}\). Now at this point we must add back to the queue a set of preorders \(\succeq _{v\,i}''\) such that: for all total \({\succeq _v^T} \supseteq {\succeq _v}\), either \({\succeq _v^T} \supseteq {\succeq _v'}\) (under which joinability is proven) or \({\succeq _v^T} \supseteq \text {some }\succeq _{v\,i}''\) (added to V to be checked). For efficiency, we would also like for there to be no overlap: no total \({\succeq _v^T} \supseteq {\succeq _v}\) is an extension of more than one of \(\{ \succeq _v',\succeq _{v\,1}'',\dotsc \}\).

This is true because of Theorem 3. So we add \(\{ \langle \succeq _{v\,i}'',s^r,t^r,c^r \rangle \mid {\succeq _{v\,i}''}\in \text {order}\_\text {diff}(\succeq _v,\succeq _v') \}\) to V, where \(c^r = c \setminus (\text {if }s^r\ne s\text { then }\{\texttt {L}\}\) else \(\{\}\)) \(\setminus (\text {if }t^r\ne t\) \(\text {then }\{\texttt {R}\}\text { else }\{\})\). Note also that \(s \mathrel {\overset{*}{\rightarrow }}s^r\) and \(t \mathrel {\overset{*}{\rightarrow }}t^r\) via rules oriented by \(\mathrel {\succ _{t[v]}}\), and therefore also via rules oriented by \(\mathrel {\succ _{t[v_i'']}}\) whenever \({\succeq _{v\,i}''} \supset {\succeq _v}\), so it is sound to continue from \(s^r,t^r\).

During this whole process, any rewrite must pass the completeness test mentioned previously, such that the conditions in the definition of strong joinability (Definition 2) hold. Let \(s_0,t_0\) be the original terms, s, t the ones being rewritten, and c the completeness flag. If the rewrite is at a strict subterm position, it succeeds by Definition 2. If the rewrite is at the top, then we check c. If \(\texttt {L}\) is unset (\(\texttt {L} \notin c\)), then either \(s\succeq s_0\prec t_0\) or \(s\prec s_0\) or the clause is not a positive unit, so we allow a rewrite at the top of s, again by Definition 2. If \(\texttt {L}\) is set (\(\texttt {L}\in c\)), then an explicit check must be done: we allow a rewrite at the top of s (\(=s_0\)) iff it is done by \(l\sigma \mathbin {\rightarrow }r\sigma \) with \(l\sigma \mathrel {\sqsupset }l\) or \(r\sigma \prec t_0\). Respectively for \(\texttt {R}\), with the roles of s and t swapped.

In short, we have shown that if \(\langle \succeq _v,s',t',c' \rangle \) is popped from V, then V is only ever empty, and so the algorithm only terminates with “Success”, if for all total \({\succeq _v^T} \supseteq {\succeq _v}\). Since V is initialised with \(\langle \emptyset ,s,t,c \rangle \), then the algorithm only returns “Success” if for all total \({\succeq _v^T}\).    \(\square \)

Orienting via Extension of Variable Ordering. In order to apply the ground joinability algorithm we need a way to check, for a given \({\succ _t}\) and \({\succeq _v}\) and some st, whether there exists a \({\succeq _v'} \supset {\succeq _v}\) such that \(s \mathrel {\succ _{t[v']}} t\). Here we show how to do this when \({\succ _t}\) is a Knuth-Bendix Ordering (KBO) [15].

Recall the definition of KBO. Let \(\succ _s\) be a partial order on symbols and w an \(\mathbb N\)-valued weight function on symbols and variables with the following properties: all variables have the same weight (\({\exists }m~{\forall }{x\in \mathcal {V}} .~ w(x)=m\)); \(w(c) \ge m\) for all constants c; and at most one unary symbol f may have \(w(f)=0\), in which case \(f \succ _s g\) for all other symbols g. The weight of a term is \(w(f(s_1,\dotsc )) = w(f) + w(s_1) + \cdots \). Let also \(|s|_{x}\) be the number of occurrences of x in s. Then

figure d

The conditions on variable occurrences ensure that \(s \succ _\text {KBO} t \mathrel {\Rightarrow }{\forall }\theta .~s\theta \succ _\text {KBO} t\theta \).
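For concreteness, the comparison just recalled can be sketched in Python as follows. This is a simplified illustration under assumptions of our own (tuple-based terms, a precedence given by integer ranks, and omission of the special case of a unary symbol of weight zero), not the prover's implementation.

```python
# Sketch of KBO: variables are strings, compound terms are tuples ("f", arg1, ...).
# `weight` maps symbols to naturals, with key "_var" as the shared variable weight m;
# `prec` maps symbols to integer precedence ranks (higher rank = greater symbol).

def var_counts(t, counts=None):
    """Count occurrences of each variable in term t (the |t|_x values)."""
    counts = {} if counts is None else counts
    if isinstance(t, str):                       # variable
        counts[t] = counts.get(t, 0) + 1
    else:                                        # ("f", s1, ..., sn)
        for arg in t[1:]:
            var_counts(arg, counts)
    return counts

def term_weight(t, weight):
    """w(f(s1, ..., sn)) = w(f) + w(s1) + ... + w(sn); all variables weigh m."""
    if isinstance(t, str):
        return weight["_var"]
    return weight[t[0]] + sum(term_weight(s, weight) for s in t[1:])

def kbo_greater(s, t, weight, prec):
    """s >_KBO t: variable condition, then weight, then precedence, then lex."""
    cs, ct = var_counts(s), var_counts(t)
    if any(cs.get(x, 0) < n for x, n in ct.items()):
        return False                             # need |s|_x >= |t|_x for all x
    ws, wt = term_weight(s, weight), term_weight(t, weight)
    if ws != wt:
        return ws > wt
    if isinstance(s, str) or isinstance(t, str):
        return False                             # equal-weight variable cases omitted
    f, g = s[0], t[0]
    if f != g:
        return prec.get(f, -1) > prec.get(g, -1)
    for si, ti in zip(s[1:], t[1:]):             # lexicographic on arguments
        if si != ti:
            return kbo_greater(si, ti, weight, prec)
    return False
```

Note that the variable-count test at the start is precisely what guarantees the stability property \(s \succ _\text {KBO} t \mathrel {\Rightarrow }{\forall }\theta .~s\theta \succ _\text {KBO} t\theta \) mentioned above, and it is also why \(f(x,y)\approx f(y,x)\) cannot be oriented.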

When we extend the order \(\succ _\text {KBO}\) with a variable preorder \(\succeq _v\), the starting point is that \(x \succ _v y \mathrel {\Rightarrow }x \mathrel {\succ _{\text {KBO}[v]}} y\) and \(x \sim _v y \mathrel {\Rightarrow }x \mathrel {\sim _{\text {KBO}[v]}} y\). Then, to ensure that all the properties of a simplification order (including the one mentioned above) hold, we arrive at the following definition (similar to [1]).

figure e

To check whether there exists a \({\succeq _v'} \supset {\succeq _v}\) such that \(s \mathrel {\succ _{\text {KBO}[v']}} t\), we need to check whether there are some \(x\mathord \succ y\) or \(x\,\mathord =\,y\) relations that we can add to \(\succeq _v\) such that all the conditions above hold (and such that it still remains a valid preorder). Let us denote “there exists a \({\succeq _v'} \supset {\succeq _v}\) such that \(s \mathrel {\succ _{\text {KBO}[v']}} t\)” by \(s \mathrel {\succ _{\text {KBO}[v,v']}} t\). Then the definition is

figure f

This check can be used in Algorithm 1 for finding extensions of variable orderings that orient rewrite rules allowing required normalisations.

4.3 Connectedness

Testing for joinability (i.e. demodulating to \(s\approx s\) or \(s\not \approx s\)) and for ground joinability (presented in the previous section) requires that each step in the proof is done via an oriented instance of an equation in the set. However, we can weaken this restriction if we also change the notion of redundancy being used.

As criteria for redundancy of clauses, joinability or ground joinability of a literal means that the clause can be deleted or the literal removed from the clause (for a positive or negative literal, resp.) in any context. That is, we can, for example, add such clauses to a set of deleted clauses, and immediately remove any new clause that appears in that set, since we already know it is redundant. The criterion of connectedness [3, 21], however, is a criterion for redundancy of inferences. This means that a conclusion simplified by this criterion can be deleted (or rather, not added), but only in that context; if it ever comes up again as the conclusion of a different inference, then it is not necessarily also redundant. Connectedness was introduced in the context of equational completion; here we extend it to general clauses and show that it is a redundancy criterion in the superposition calculus.

Definition 4

Terms s and t are connected under clauses U and unifier \({\rho }\) wrt. a set of equations S if there exist terms \(v_1,\dotsc ,v_n\), equations \({l_1\approx r_1},\dotsc ,{l_{n-1}\approx r_{n-1}}\), and substitutions \({\sigma _1},\dotsc ,\sigma _{n-1}\) such that:

  1. (i)

    \(v_1 = s\) and \(v_n = t\),

  2. (ii)

    for all \(i\in 1,\dotsc ,n-1\), either \(v_{i+1} = v_i[l_i\sigma _i \mathbin {\mapsto }r_i\sigma _i]\) or \(v_i = v_{i+1}[l_i\sigma _i \mathbin {\mapsto }r_i\sigma _i]\), with \(l_i\approx r_i\in S\),

  3. (iii)

    for all \(i\in 1,\dotsc ,n-1\), there exists w in \(\bigcup _{C{\in }U} \bigcup _{p\mathbin {\dot{\approx }}q {\in } C} \{p,q\}\)Footnote 4 such that for \(u_i\in \{l_i,r_i\}\), either (a) \(u_i\sigma _i \prec w{\rho }\), or (b) \(u_i\sigma _i = w{\rho }\) and either \(u_i \mathrel {\sqsubset }w\) or \(w\in C\) such that C is not a positive unit.
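Condition (ii), the rewrite chain between s and t, can be visualised with a small sketch (our own simplified encoding, with tuple-based terms as before; each step supplies the already-instantiated sides \(l_i\sigma _i\) and \(r_i\sigma _i\), and the ordering side conditions of (iii) are not modelled here):

```python
# Sketch of checking condition (ii): every adjacent pair in the chain
# v_1, ..., v_n differs by replacing one occurrence of l_i.sigma_i with
# r_i.sigma_i, in either direction. Variables are strings, compound terms
# are tuples ("f", arg1, ...).

def rewrites_to(u, v, lhs, rhs):
    """True if v is u with exactly one occurrence of lhs replaced by rhs."""
    if u == lhs and v == rhs:
        return True                                  # rewrite at this position
    if isinstance(u, str) or isinstance(v, str):
        return False
    if u[0] != v[0] or len(u) != len(v):
        return False
    diffs = [i for i in range(1, len(u)) if u[i] != v[i]]
    return len(diffs) == 1 and rewrites_to(u[diffs[0]], v[diffs[0]], lhs, rhs)

def chain_is_connected(chain, steps):
    """chain: [v_1, ..., v_n]; steps: [(l_i.sigma_i, r_i.sigma_i), ...]."""
    if len(steps) != len(chain) - 1:
        return False
    return all(
        rewrites_to(chain[i], chain[i + 1], l, r)    # forward step
        or rewrites_to(chain[i + 1], chain[i], l, r) # backward step
        for i, (l, r) in enumerate(steps)
    )
```

The key contrast with joinability is visible here: the steps need not form two oriented rewrite sequences meeting in the middle; any zig-zag of equational steps is admitted, provided each step satisfies the ordering constraints of (iii).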

Theorem 5

Superposition inferences of the form

\[ \frac{l\approx r\vee C \qquad s[u]\approx t\vee D}{(s[u\mathbin {\mapsto }r]\approx t\vee C\vee D){\rho }} \qquad (15) \]

where \(s[u\mathbin {\mapsto }r]{\rho }\) and \(t{\rho }\) are connected under \(\{ l\approx r\vee C,~ s\approx t\vee D \}\) and unifier \({\rho }\) wrt. some set of clauses S, are redundant inferences wrt. S.

Proof

Let us denote \(s' = s[u\mathbin {\mapsto }r]\). Let also \(U = \{ l\approx r\vee C,~ s\approx t\vee D \}\) and \(M = \bigcup _{C{\in }U} \bigcup _{p\mathbin {\dot{\approx }}q {\in } C} \{p,q\}\). We will show that if \(s'{\rho }\) and \(t{\rho }\) are connected under U and \({\rho }\), by equations in S, then every instance of that inference obeys the condition for closure redundancy of an inference (see Sect. 4) wrt. S.

Consider any \((s'\approx t \vee C \vee D){\rho } \cdot \theta \) where \(\theta \in {\text {GSubs}}(U{\rho })\). Either \(s'\!{\rho }\theta = t{\rho }\theta \), and we are done (it follows from \(\emptyset \)), or \(s'\!{\rho }\theta \succ t{\rho }\theta \), or \(s'\!{\rho }\theta \prec t{\rho }\theta \).

Consider the case \(s'\!{\rho }\theta \succ t{\rho }\theta \). For all \(i\in 1,\dotsc ,n-1\), there exists a \(C'\in U\) and a \(w\in C'\) such that either (iii.a) \(l_i\sigma _i\theta \prec w{\rho }\theta \), or (iii.b) \(l_i\sigma _i\theta = w{\rho }\theta \) and \(l_i \mathrel {\sqsubset }w\), or (iii.b) \(l_i\sigma _i\theta = w{\rho }\theta \) and \(C'\) is not a positive unit. Likewise for \(r_i\). Therefore, for all \(i\in 1,\dotsc ,n-1\), there exists a \(C'\in U\) such that \((l_i\approx r_i) \cdot \sigma _i\theta \prec C' \cdot {\rho }\theta \). Since \((t\approx t \vee \cdots ){\rho } \cdot \theta \) is also smaller than \((s'\approx t \vee \cdots ){\rho }\cdot {\theta }\) and a tautology, then the instance \((s'\approx t \vee \cdots ){\rho }\cdot {\theta }\) of the conclusion follows from closures in \({\text {GClos}}(S)\) such that each is smaller than one of \((l\approx r\vee C) \cdot {\rho }\theta \), \((s\approx t\vee D) \cdot {\rho }\theta \).

In the case that \(s'\!{\rho }\theta \prec t{\rho }\theta \), the same idea applies, but now it is \((s'\approx s' \vee \cdots ){\rho }\cdot {\theta }\) which is smaller than \((s'\approx t \vee \cdots ){\rho }\cdot {\theta }\) and is a tautology.

Therefore, we have shown that for all \(\theta \in {\text {GSubs}}((l\approx r\vee C){\rho } ,~ (s\approx t\vee D){\rho })\), the instance \((s'\approx t \vee \cdots ){\rho }\cdot {\theta }\) of the conclusion follows from closures in \({\text {GClos}}(S)\) which are all smaller than one of \((l\approx r\vee C) \cdot {\rho }\theta ,~ (s\approx t\vee D) \cdot {\rho }\theta \). Since any valid superposition inference with ground clauses has to have \(l=u\), then any \(\theta '\in {\text {GSubs}}(l\approx r\vee C ,~ s\approx t\vee D ,~ (s'\approx t\vee C\vee D){\rho })\) such that the inference is valid must have \(\theta ' = {\rho }\theta ''\), since \({\rho }\) is the most general unifier. Therefore, we have shown that for all \(\theta '\in {\text {GSubs}}(l\approx r\vee C ,~ s\approx t\vee D ,~ (s'\approx t\vee C\vee D){\rho })\) for which the inference is valid, the instance \((s'\approx t \vee \cdots ){\rho }\cdot {\theta '}\) of the conclusion follows from closures in \({\text {GClos}}(S)\) which are all smaller than one of \((l\approx r\vee C) \cdot \theta ' ,~ (s\approx t\vee D) \cdot \theta '\), so the inference is redundant.    \(\square \)

Theorem 6

Superposition inferences of the form

\[ \frac{l\approx r\vee C \qquad s[u]\not \approx t\vee D}{(s[u\mathbin {\mapsto }r]\not \approx t\vee C\vee D){\rho }} \qquad (16) \]

where \(s[u\mathbin {\mapsto }r]{\rho }\) and \(t{\rho }\) are connected under \(\{ l\approx r\vee C,~ s\not \approx t\vee D \}\) and unifier \({\rho }\) wrt. some set of clauses S, are redundant inferences wrt. \(S \cup \{(C\vee D){\rho }\}\).

Proof

Analogously to the previous proof, we find that for all instances of the inference, the closure \((s'\not \approx t \vee \cdots ){\rho } \cdot \theta \) follows from smaller closure \((t\not \approx t \vee \cdots ){\rho } \cdot \theta \) or \((s'\not \approx s' \vee \cdots ){\rho } \cdot \theta \) and closures \((l_i\approx r_i) \cdot \sigma _i\theta \) smaller than \(\max \{ (l\approx r\vee C) \cdot \theta \mathrel {,}(s\not \approx t \vee D) \cdot \theta \mathrel {,}(s'\not \approx t\vee C \vee D){\rho } \cdot \theta \}\). But \((t\not \approx t\vee C \vee D){\rho } \cdot \theta \) and \((s'\not \approx s' \vee C \vee D){\rho }\cdot {\theta }\) both follow from smaller \((C\vee D){\rho } \cdot \theta \), therefore the inference is redundant wrt. \(S \cup \{(C\vee D){\rho }\}\).    \(\square \)

4.4 Ground Connectedness

Just as joinability can be generalised to ground joinability, so can connectedness be generalised to ground connectedness. Two terms s, t are ground connected under U and \({\rho }\) wrt. S if, for all \(\theta \in {\text {GSubs}}(s,t)\), \(s\theta \) and \(t\theta \) are connected under U and \({\rho }\) wrt. S. Analogously to strong ground joinability, we have that if s and t are connected using \(\mathrel {\succ _{t[v]}}\) for all total \(\succeq _v\) over \({\text {Vars}}(s,t)\), then s and t are ground connected.
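Checking "for all total \(\succeq _v\)" naively means enumerating every total preorder over \({\text {Vars}}(s,t)\), i.e. every ordered partition of the variables into equivalence classes. The following brute-force sketch (our own illustration of the search space; the incremental Algorithm 1 avoids this exhaustive enumeration) generates them:

```python
# Enumerate all total preorders over a set of variables, represented as
# lists of equivalence classes ordered from greatest to least.

from itertools import permutations

def set_partitions(xs):
    """All partitions of the list xs into nonempty blocks."""
    if not xs:
        yield []
        return
    first, rest = xs[0], xs[1:]
    for part in set_partitions(rest):
        yield [[first]] + part                       # first in its own block
        for i in range(len(part)):                   # or joined to a block
            yield part[:i] + [part[i] + [first]] + part[i + 1:]

def total_preorders(variables):
    """All total preorders: each partition, in every possible block order."""
    for part in set_partitions(list(variables)):
        for ordering in permutations(part):
            yield list(ordering)
```

The number of total preorders is the ordered Bell number (3 for two variables: \(x\succ y\), \(y\succ x\), \(x\sim y\); 13 for three), which grows very quickly; this is precisely why an algorithm that prunes whole families of extensions at once matters in practice.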

Theorem 7

Superposition inferences of the form

\[ \frac{l\approx r\vee C \qquad s[u]\approx t\vee D}{(s[u\mathbin {\mapsto }r]\approx t\vee C\vee D){\rho }} \qquad (17) \]

where \(s[u\mathbin {\mapsto }r]{\rho }\) and \(t{\rho }\) are ground connected under \(\{ l\approx r\vee C,~ s\approx t\vee D \}\) and unifier \({\rho }\) wrt. some set of clauses S, are redundant inferences wrt. S.

Theorem 8

Superposition inferences of the form

\[ \frac{l\approx r\vee C \qquad s[u]\not \approx t\vee D}{(s[u\mathbin {\mapsto }r]\not \approx t\vee C\vee D){\rho }} \qquad (18) \]

where \(s[u\mathbin {\mapsto }r]{\rho }\) and \(t{\rho }\) are ground connected under \(\{ l\approx r\vee C,~ s\not \approx t\vee D \}\) and unifier \({\rho }\) wrt. some set of clauses S, are redundant inferences wrt. \(S \cup \{(C\vee D){\rho }\}\).

Proof

The proofs of Theorems 7 and 8 are analogous to those of Theorems 5 and 6. The weakening of connectedness to ground connectedness only means that the proof of connectedness (e.g. the \(v_i\), \(l_i\approx r_i\), \(\sigma _i\)) may be different for different ground instances. For all the steps in the proof to hold we only need that for all the instances \(\theta \in {\text {GSubs}}(l\approx r\vee C \mathrel {,}s\mathbin {\dot{\approx }}t\vee D \mathrel {,}(s[u\mathbin {\mapsto }r]\mathbin {\dot{\approx }}t\vee C\vee D){\rho })\) of the inference, \(\theta = \sigma \theta '\) with \(\sigma \in {\text {GSubs}}(s[u\mathbin {\mapsto }r]{\rho },t{\rho })\), which is true.    \(\square \)

Discussion about the strategy for implementation of connectedness and ground connectedness is outside the scope of this paper.

5 Evaluation

We implemented ground joinability in a theorem prover for first-order logic, iProver [10, 16].Footnote 5 iProver combines the superposition, Inst-Gen, and resolution calculi. For superposition, iProver implements a range of simplifications including encompassment demodulation, AC normalisation [10], light normalisation [16], subsumption and subsumption resolution. We ran our experiments over the FOF problems of the TPTP v7.5 library [23] (17 348 problems) on a cluster of Linux servers with 3 GHz 11-core CPUs and 128 GB memory, with each problem running on a single core with a time limit of 300 s. We used a default strategy (not yet fine-tuned after the introduction of ground joinability), with superposition enabled and the rest of the components disabled. With ground joinability enabled, iProver solved 133 additional problems which it did not solve without ground joinability. Note that this excludes the contribution of AC ground joinability and encompassment demodulation [11] (always enabled).

Some of the problems are not interesting for this analysis because ground joinability is never even tried, either because they are solved before superposition saturation begins, or because they are ground. If we exclude these, we are left with 10 005 problems. Ground joinability is successfully used to eliminate clauses in 3057 of them (\(30.6\%\), Fig. 1a). This indicates that ground joinability is useful in many classes of problems, including non-unit problems, where it had never previously been used.

Fig. 1.
figure 1

(a) Clauses simplified by ground joinability. (b) % of runtime spent in gr. joinability

In terms of the performance impact of enabling ground joinability, we measure that among problems whose runtime exceeds 1 s, only in 72 out of 8574 problems does the time spent inside the ground joinability algorithm exceed \(20\%\) of runtime, indicating that our incremental algorithm is efficient and suitable for broad application (Fig. 1b).

TPTP classifies problems by a rating in [0, 1]. Problems with rating \({\ge }{0.9}\) are considered to be very challenging. Problems with rating 1.0 have never been solved by any automated theorem prover. iProver using ground joinability solves 3 previously unsolved rating-1.0 problems, and 7 further problems with rating in [0.9, 1.0[ (Table 1). We note that some of the latter (e.g. LAT140-1, ROB018-10, REL045-1) were previously only solved by UEQ or SMT provers, but not by any full first-order prover.

Table 1. Hard or unsolved problems in TPTP, solved by iProver with ground joinability.

6 Conclusion and Further Work

In this work we extended the superposition calculus with ground joinability and connectedness, and proved that these rules preserve completeness using a modified notion of redundancy, thus making these techniques available, for the first time, for full first-order logic problems. We have also presented an algorithm for checking ground joinability which attempts to check as few variable preorders as possible.

Preliminary results show three things: (1) ground joinability is applicable in a sizeable number of problems across different domains, including in non-unit problems (where it was never applied before), (2) our proposed algorithm for checking ground joinability is efficient, with over \(\frac{3}{4}\) of problems spending less than \(1\%\) of runtime there, and (3) application of ground joinability in the superposition calculus of iProver improves overall performance, including discovering solutions to hitherto unsolved problems.

These results are promising, and further optimisations can be done. Immediate next steps include fine-tuning the implementation, namely adjusting the strategies and strategy combinations to make full use of ground joinability and connectedness. iProver uses a sophisticated heuristic system which has not yet been tuned for ground joinability and connectedness [14].

In terms of practical implementation of connectedness and ground connectedness, further research is needed on the interplay between those (criteria for redundancy of inferences) and joinability and ground joinability (criteria for redundancy of clauses).

On the theoretical level, recent work [24] provides a general framework for saturation theorem proving, and we will investigate how techniques developed in this paper can be incorporated into this framework.