Multi-Completion with Termination Tools

Knuth–Bendix completion is a classical calculus in automated deduction for transforming a set of equations into a confluent and terminating set of directed equations, which can be used to decide the induced equational theory. Multi-completion with termination tools differs from the classical method in two respects: (1) external termination tools replace the reduction order, a typically critical parameter, as proposed by Wehrman et al. (2006), and (2) multi-completion as introduced by Kurihara and Kondo (1999) is used to keep track of multiple orientations in parallel while exploiting sharing to boost efficiency. In this paper we describe the inference system, give the full proof of its correctness and comment on completeness issues. Critical pair criteria and isomorphisms are presented as refinements together with all proofs. We furthermore describe the implementation of our approach in the tool $\mathsf{mkbTT}$, present extensive experimental results and report on new completions.


Introduction
In a landmark paper, Knuth and Bendix [12] introduced a completion procedure which aims to transform a set of input equalities into a terminating and confluent rewrite system. If successful, the resulting system can be used to decide the associated equational theory. However, success of a completion run critically depends on the reduction order that is required as additional input.
Kurihara and Kondo [15] introduced the calculus of multi-completion which supports completion with multiple reduction orders at the same time. Basically, a deduction simulates parallel completion runs with the orders under consideration but gains efficiency by sharing inference steps among the parallel deductions. Since multi-completion succeeds as soon as one of the mimicked runs achieves a result, this approach partially tackles the problem of choosing an appropriate reduction order. Still, concrete reduction orders have to be provided as input.
Wehrman et al. [32] proposed a different approach. Instead of relying on a reduction order supplied by the user, rewrite rules are oriented by a termination prover internally. In this way an appropriate reduction order is implicitly developed along the deduction. Since modern termination provers employ many more sophisticated techniques than plain reduction orders, this approach makes it possible to construct convergent systems that cannot be obtained with classical completion procedures. One such system is the theory of two commuting group endomorphisms $\mathrm{CGE}_2$ [26], which can be completed by the tool Slothrop described in [32] without user interaction.
Multi-completion with termination tools [21,22,36] constitutes a combination of these two approaches. While the use of termination tools allows for automatic completion without user interaction, multi-completion makes it possible to keep track of multiple combinations of orientations in parallel, thereby exploiting sharing for efficiency reasons. The implementation of our technique thus yields a powerful tool for automatic completion with a high flexibility concerning orientations. In this paper we describe the underlying inference system, and present simulation and correctness results along with the proofs that were omitted in [21,22] for reasons of space. After detecting a flaw in the fairness definition of [15], we newly contribute a corrected and explicit definition for MKBtt. We also present a novel completeness result for a variant of the MKBtt calculus that (to our knowledge) has not been achieved by previous completion approaches. Refinements such as critical pair criteria and isomorphisms that were already outlined in [36] are described in more detail along with proofs showing their correctness. In a section on implementation details, besides the basic control loop we present optimizations such as term indexing techniques and selection strategies and explain how various options can be controlled by the user. Recent optimizations led to the completion of novel systems such as $\mathrm{CGE}_5$. By conducting thorough benchmark tests on a considerably extended database we assessed the different enhancements. Extending the experiments presented in previous papers, we also compare with maximal completion as developed by Klein and Hirokawa [11]. In short, this article constitutes a comprehensive report of the MKBtt approach and subsumes earlier contributions.
The paper is structured as follows. We start by summarizing preliminaries in Section 2. In Section 3 we present the inference system underlying MKBtt, simulation and correctness results as well as a completeness result concerning a modified version of the calculus. As optimizations, critical pair criteria and isomorphisms are presented in Sections 4 and 5. Section 6 comments on some implementation details of our completion tool before experimental results are described in Section 7. Finally, Section 8 adds some concluding remarks and lists issues for future work.

Preliminaries
We consider terms $\mathcal{T}(\mathcal{F}, \mathcal{V})$ over a finite signature $\mathcal{F}$ and a set of variables $\mathcal{V}$. For a term $t$ we denote by $\mathcal{P}\mathrm{os}(t)$ its set of positions, which is partitioned into function symbol positions $\mathcal{P}\mathrm{os}_{\mathcal{F}}(t)$ and variable positions $\mathcal{P}\mathrm{os}_{\mathcal{V}}(t)$. If $p \in \mathcal{P}\mathrm{os}(t)$ then $t|_p$ denotes the subterm of $t$ at position $p$ and $t[s]_p$ is the term obtained from $t$ by replacing $t|_p$ with $s$. A term $t$ encompasses a term $s$, denoted by $t \mathrel{\dot{\unrhd}} s$, if $t = t[s\sigma]_p$ holds for some substitution $\sigma$ and position $p \in \mathcal{P}\mathrm{os}(t)$. The strict part $\dot{\unrhd} \setminus \dot{\unlhd}$ of this relation is denoted by $\dot{\rhd}$. We call two terms $s$ and $t$ variants and write $s \doteq t$ if there exists a variable renaming $\sigma$ such that $s\sigma = t$.
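The notions of position, subterm and replacement can be made concrete with a small sketch. The encoding below is an assumption made purely for illustration: variables are Python strings, an application $f(t_1, \ldots, t_n)$ is the tuple `("f", t1, ..., tn)`, and positions are tuples of 0-based argument indices with `()` as the root position. All function names are hypothetical.

```python
def is_var(t):
    # Variables are plain strings; applications are tuples ("f", arg1, ...).
    return isinstance(t, str)

def positions(t):
    """All positions of t, root first."""
    result = [()]
    if not is_var(t):
        for i, arg in enumerate(t[1:]):
            result.extend((i,) + p for p in positions(arg))
    return result

def subterm(t, p):
    """The subterm t|_p."""
    for i in p:
        t = t[i + 1]  # skip the function symbol stored at index 0
    return t

def replace(t, p, s):
    """The term t[s]_p, obtained by replacing t|_p with s."""
    if not p:
        return s
    i = p[0] + 1
    return t[:i] + (replace(t[i], p[1:], s),) + t[i + 1:]

def fun_positions(t):
    """Function symbol positions of t."""
    return [p for p in positions(t) if not is_var(subterm(t, p))]

def var_positions(t):
    """Variable positions of t."""
    return [p for p in positions(t) if is_var(subterm(t, p))]
```

For the term `("f", ("g", "x"), "y")`, that is $f(g(x), y)$, the function symbol positions are the root and the first argument, while the two variable occurrences make up the variable positions.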
Sets of equations between terms will be denoted by $E$ and are assumed to be symmetric. The associated equational theory is denoted by $\approx_E$. As usual, a set of directed equations $\ell \to r$ is called a rewrite system and denoted by $R$, where $\to_R$ is the associated rewrite relation. We write $s \xrightarrow{\ell \to r}_p t$ to express that $s \to_R t$ was achieved by applying the rule $\ell \to r \in R$ at position $p$. The relations $\to^+_R$, $\to^*_R$ and $\leftrightarrow_R$ denote the transitive, transitive-reflexive and symmetric closure of $\to_R$. The smallest equivalence relation containing $\to_R$, which coincides with the equational theory $\approx_R$ if $R$ is considered as a set of equations, is denoted by $\leftrightarrow^*_R$. Subscripts are omitted if the rewrite system or the set of equations is clear from the context.
A rewrite system $R$ is terminating if it does not admit infinite rewrite sequences. It is confluent if for every peak $t \mathrel{{}^*{\leftarrow}} s \to^* u$ there exists a term $v$ such that $t \to^* v \mathrel{{}^*{\leftarrow}} u$. An overlap is a triple $\langle u \to v, p, \ell \to r \rangle$ where $u \to v$ and $\ell \to r$ are rewrite rules without common variables such that $p \in \mathcal{P}\mathrm{os}_{\mathcal{F}}(u)$, $u|_p$ and $\ell$ are unifiable with most general unifier $\sigma$, and if $\ell \to r$ and $u \to v$ are variants then $p \neq \epsilon$. The term $u\sigma = u\sigma[\ell\sigma]_p$ can be rewritten in two different ways, resulting in the critical pair $v\sigma \approx u\sigma[r\sigma]_p$. The set of critical pairs among rules in $R$ is denoted by $\mathrm{CP}(R)$. A rewrite system $R$ which is both terminating and confluent is called convergent. If $R$ has the property that for every rewrite rule $\ell \to r$ the right-hand side $r$ is in normal form and the left-hand side $\ell$ is in normal form with respect to $R \setminus \{\ell \to r\}$ then $R$ is called reduced. We call $R$ convergent for a set of equations $E$ if $R$ is convergent and $\leftrightarrow^*_R$ coincides with $\approx_E$. A convergent and reduced rewrite system is called canonical.
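As an illustration of rewriting to normal form and of the reducedness condition, the following sketch may help. It assumes the same hypothetical term encoding as before (variables are strings, applications are tuples headed by the function symbol) and represents a rule as a pair `(lhs, rhs)`. The rewrite strategy (root first, then arguments from left to right) is an arbitrary choice, and `normal_form` only returns if the given rules terminate on the input term.

```python
def is_var(t):
    return isinstance(t, str)

def match(pattern, term, subst=None):
    """Return a substitution sigma with pattern*sigma == term, or None."""
    subst = dict(subst or {})
    if is_var(pattern):
        if pattern in subst and subst[pattern] != term:
            return None
        subst[pattern] = term
        return subst
    if is_var(term) or pattern[0] != term[0] or len(pattern) != len(term):
        return None
    for p_arg, t_arg in zip(pattern[1:], term[1:]):
        subst = match(p_arg, t_arg, subst)
        if subst is None:
            return None
    return subst

def apply(subst, t):
    """Apply a substitution to a term."""
    if is_var(t):
        return subst.get(t, t)
    return (t[0],) + tuple(apply(subst, a) for a in t[1:])

def rewrite_step(rules, t):
    """One rewrite step (root first, then arguments), or None if t is a normal form."""
    for lhs, rhs in rules:
        sigma = match(lhs, t)
        if sigma is not None:
            return apply(sigma, rhs)
    if not is_var(t):
        for i, arg in enumerate(t[1:], start=1):
            reduct = rewrite_step(rules, arg)
            if reduct is not None:
                return t[:i] + (reduct,) + t[i + 1:]
    return None

def normal_form(rules, t):
    """Rewrite until no rule applies; does not return for non-terminating rules."""
    while True:
        reduct = rewrite_step(rules, t)
        if reduct is None:
            return t
        t = reduct

def is_reduced(rules):
    """Right-hand sides irreducible, left-hand sides irreducible by the other rules."""
    for rule in rules:
        lhs, rhs = rule
        others = [r for r in rules if r != rule]
        if rewrite_step(rules, rhs) is not None:
            return False
        if rewrite_step(others, lhs) is not None:
            return False
    return True

# Two rules from group theory: e * x -> x and i(x) * x -> e.
GROUP_RULES = [
    (("*", ("e",), "x"), "x"),
    (("*", ("i", "x"), "x"), ("e",)),
]
```

With these two rules, the term $e \cdot (i(a) \cdot a)$ rewrites to $e$ in two steps, and the rule set is reduced in the sense defined above.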
A proper order $\succ$ on terms is a rewrite order if it is closed under contexts and substitutions. A well-founded rewrite order is called a reduction order. The relation $\to^+_R$ is a reduction order for every terminating rewrite system $R$.
In the context of completion, we often consider a pair $(E, R)$ of equations $E$ and rewrite rules $R$. An equational proof step $s \leftrightarrow^p_e t$ in $(E, R)$ is an equality step if $e$ is an equation $\ell \approx r$ in $E$ or a rewrite step if $e$ is a rule $\ell \to r$ in $R$, and either $s = u[\ell\sigma]_p$ and $t = u[r\sigma]_p$ or $s = u[r\sigma]_p$ and $t = u[\ell\sigma]_p$ hold for some substitution $\sigma$ and term $u$ with position $p$. We sometimes write $s \leftrightarrow t$ to express the existence of some proof step, omitting the position $p$ and equation or rule $e$. An equational proof $P$ of an equation $t_0 \approx t_n$ is a finite sequence of equational proof steps

$$t_0 \leftrightarrow t_1 \leftrightarrow \cdots \leftrightarrow t_n \qquad (1)$$

Note that $(E, R)$ admits an equational proof of $s \approx t$ if and only if $s \leftrightarrow^*_{E \cup R} t$ holds. A sequence $Q$ of the form $t_i \leftrightarrow \cdots \leftrightarrow t_j$ with $0 \leq i \leq j \leq n$ is a subproof of $P$. We write $P[Q]$ to express that $P$ contains $Q$ as a subproof. If $P$ is an equational proof and $\sigma$ a substitution then $P\sigma$ denotes the instantiated proof

$$t_0\sigma \leftrightarrow t_1\sigma \leftrightarrow \cdots \leftrightarrow t_n\sigma$$

For a term $u$ with position $q$ and a proof $P$ of the shape (1) we write $u[P]_q$ to denote the sequence

$$u[t_0]_q \leftrightarrow u[t_1]_q \leftrightarrow \cdots \leftrightarrow u[t_n]_q$$

which is again an equational proof. Proofs of the shape […]

A proof reduction relation $\Rightarrow$ additionally satisfies 3. $P \Rightarrow Q$ holds only if $P$ and $Q$ prove the same equation.

Standard Completion
The classical completion procedure proposed by Knuth and Bendix [12] was reformulated as an inference system by Bachmair [2], as depicted in Fig. 1. The inference system (in the sequel referred to as KB) works on pairs $(E, R)$ consisting of a set of equations $E$ and a set of rewrite rules $R$, and is parameterized by a reduction order $\succ$. The inference rules of KB induce a proof transformation relation on the level of equational proofs. For example, if deduce adds a critical pair between rules $\ell \to r$ and $u \to v$ that overlap on a term $w$, the peak $s \xleftarrow{\ell \to r} w \xrightarrow{u \to v} t$ can be replaced by the proof $s \xleftrightarrow{s \approx t} t$. Similarly, the other inference rules allow patterns in equational proofs to be replaced. In the sequel, we denote by $\Rightarrow_{\mathsf{KB}}$ the (transitive) proof transformation relation induced by KB using the reduction order $\succ$. This relation terminates and constitutes a proof reduction relation [2].
A KB inference sequence of the form

$$(E_0, R_0) \vdash (E_1, R_1) \vdash (E_2, R_2) \vdash \cdots$$

is in the sequel referred to as a run with persistent equations $E_\omega = \bigcup_i \bigcap_{j>i} E_j$ and rules $R_\omega = \bigcup_i \bigcap_{j>i} R_j$. A run fails if $E_\omega$ is not empty; it succeeds if $E_\omega$ is empty and $R_\omega$ is confluent and terminating. Moreover, a run is called simplifying if $R_\omega$ is reduced. Since every inference step is reflected by one or more steps in the proof reduction relation $\Rightarrow_{\mathsf{KB}}$ and this relation terminates, in non-failing runs every identity is eventually connected by a rewrite proof, provided that required inference steps are not indefinitely ignored. This property is captured by the notion of fairness.

Fig. 1 System KB of standard completion
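Definition 1 A run is fair with respect to a proof reduction relation $\Rightarrow$ if for every proof $P$ in $(E_\omega, R_\omega)$ for which there exists an inference step $(E_\omega, R_\omega) \vdash (E'_\omega, R'_\omega)$ and a proof $P'$ in $(E'_\omega, R'_\omega)$ satisfying $P \Rightarrow P'$, there also exists a proof $Q$ in $(E_i, R_i)$ for some $i \geq 0$ such that $P \Rightarrow Q$ holds.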
A simpler and sufficient condition states that any run satisfying $\mathrm{CP}(R_\omega) \subseteq \bigcup_i E_i$ is fair. Finally, we recall the main theorems stating correctness and completeness of the inference system KB [4].
Theorem 1 Any non-failing KB run using a reduction order $\succ$ that is fair with respect to $\Rightarrow_{\mathsf{KB}}$ succeeds.
Theorem 2 Assume there exists a finite convergent system $R$ which has the same equational theory as a set of equations $E$ and is contained in $\succ$. Then any non-failing run from $E$ using $\succ$ which is fair with respect to $\Rightarrow_{\mathsf{KB}}$ will produce a convergent system in finitely many steps.
With a suitable reduction order a run is thus guaranteed to produce a convergent system, provided that no persistent unorientable equations are encountered. However, a different reduction order might induce an infinite run, or even lead to failure. The choice of the order is thus highly critical for success, but hard to determine in advance. Different approaches have been proposed to tackle this problem. In the following sections we outline two of them. Multi-completion increases the chance of success by keeping track of multiple runs using different orders, whereas completion with termination tools attempts to develop a suitable order in the course of the deduction by using modern termination provers, thereby considerably widening the class of applicable orders.

Multi-Completion
Completion with multiple reduction orders, referred to as multi-completion in the sequel, was proposed by Kurihara and Kondo [15]. For a set $O = \{\succ_1, \ldots, \succ_n\}$ of orders, it simulates the parallel execution of corresponding completion runs, but shares common inference steps to gain efficiency. The key idea behind sharing is a data structure called node.
Definition 2 A node is a tuple $\langle s : t, R_0, R_1, E \rangle$ where the data $s : t$ consist of terms $s$, $t$ and the labels $R_0$, $R_1$, $E$ are subsets of $O$. The node condition requires that $R_0$, $R_1$ and $E$ are mutually disjoint, $s \succ_i t$ holds for all $\succ_i \in R_0$, and $t \succ_i s$ for all $\succ_i \in R_1$.
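The node condition of Definition 2 can be checked mechanically. The sketch below is only an illustration under simplifying assumptions: orders are referred to by name, terms are plain strings, and `toy_greater` is a hypothetical stand-in that compares string lengths rather than a genuine reduction order.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Node:
    s: str
    t: str
    r0: frozenset  # orders that regard the data as the rule s -> t
    r1: frozenset  # orders that regard the data as the rule t -> s
    e: frozenset   # orders that regard the data as the equation s ≈ t

def node_condition(node, greater):
    """Check Definition 2; greater(order, a, b) decides whether a is greater than b."""
    disjoint = not (node.r0 & node.r1 or node.r0 & node.e or node.r1 & node.e)
    return (disjoint
            and all(greater(o, node.s, node.t) for o in node.r0)
            and all(greater(o, node.t, node.s) for o in node.r1))

def toy_greater(order, a, b):
    # Toy stand-in only: "len" prefers longer strings, any other name the reverse.
    return len(a) > len(b) if order == "len" else len(b) > len(a)
```

A node whose rewrite labels share an order, or whose data contradicts one of the orders in a rewrite label, is rejected by `node_condition`.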
Intuitively, a node $\langle s : t, R_0, R_1, E \rangle$ captures the state of the term pair $s : t$ in all simulated completion processes. All orders in the equation label $E$ regard the data as an equation $s \approx t$ while orders in the rewrite labels $R_0$ and $R_1$ consider it as rewrite rules $s \to t$ and $t \to s$, respectively. Hence the node $\langle s : t, R_0, R_1, E \rangle$ is identified with $\langle t : s, R_1, R_0, E \rangle$.
Multi-completion can be described by an inference system MKB which operates on sets of nodes and consists of five rules. Figure 2 shows the orient inference rule. An MKB run $\gamma$ of the form $N_0 \vdash N_1 \vdash N_2 \vdash \cdots$ can be projected to a valid KB run $\gamma_i$ for every order $\succ_i \in O$, and conversely every KB run using $\succ_i$ can be modelled by an MKB run. Due to these simulation properties, correctness and completeness results are also obtained for MKB. For this purpose, a run $\gamma$ is called fair if it is either finite and $\gamma_i$ is a fair and nonfailing KB run for some $i$, or if it is infinite and all $\gamma_i$ are either fair or failing.

Completion with Termination Tools
Standard completion procedures depend critically on the choice of the reduction order supplied as input, thus requiring a careful decision by the user. The evolution of powerful modern termination provers exploiting a variety of sophisticated methods suggests guaranteeing termination by employing such tools instead of a fixed order. Such an approach was proposed by Wehrman, Stump and Westbrook [32] and implemented in the tool Slothrop. Some care has to be taken because it is known [23] that changing the reduction order during a completion run may result in a non-confluent rewrite system. The inference system KBtt underlying Slothrop thus operates on triples $(E, R, C)$ consisting of a set of equations $E$, a rewrite system $R$ and an additional rewrite system $C$. This extra constraint system ensures that orientations are never reversed throughout a run, thereby guaranteeing confluence of the derived system.

Fig. 3 The orient rule in KBtt

The system KBtt consists of the orient rule depicted in Fig. 3 together with the remaining KB rules where the constraint component is not modified. Again, $\vdash$ denotes the inference relation and $\vdash^=$ its reflexive closure. An empty step […] $C_i$ constitute a sequence of subsequently refined reduction orders with respect to which completion is performed, naturally exploiting the incrementality of reduction orders defined by a rewrite relation. In this respect the method resembles the approach adopted by the tool REVE [16] where the advantages of an incremental order are emphasized. Any KB run using $\succ$ can obviously be simulated in KBtt since the required termination checks of the constraint systems succeed when employing $\succ$. Conversely, finite KBtt runs deriving the final constraint system $C$ are reflected by KB runs that use the reduction order $\to^+_C$. Hence a KBtt run is called fair, successful, failing and simplifying whenever the respective definition applies to the simulated KB run. This entails finite correctness of KBtt [30], although this result does not extend to infinite runs as the infinite union of terminating rewrite systems need not terminate.
Theorem 3 Any finite non-failing and fair KBtt run succeeds.

Multi-Completion with Termination Tools
In an orient step of KBtt, termination of a new rule $s \to t$ together with the set $C$ of all previously oriented rules is checked. If both orientations $s \to t$ and $t \to s$ terminate together with $C$, an implementation faces the question of how to deal with this choice. Slothrop uses a best-first strategy to decide which branch to explore further. In contrast, MKBtt keeps track of both orientations but avoids an explosion of the search space by integrating the concept of multi-completion to share common inferences.
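The choice that arises in an orient step can be sketched as follows. The code is a minimal illustration, not the interface of Slothrop or of any termination prover: `terminates` is a hypothetical callback standing in for the external tool, terms are plain strings, and `toy_terminates` merely accepts rule sets in which every left-hand side is strictly longer than its right-hand side.

```python
def orient_choices(s, t, constraints, terminates):
    """Return the orientations of s ≈ t that terminate together with the constraints."""
    choices = []
    for lhs, rhs in ((s, t), (t, s)):
        rule = (lhs, rhs)
        # The new rule is checked together with all previously oriented rules.
        if terminates(constraints | {rule}):
            choices.append(rule)
    return choices

def toy_terminates(rules):
    # Toy stand-in for a termination tool: every left-hand side must be
    # strictly longer than the corresponding right-hand side.
    return all(len(lhs) > len(rhs) for lhs, rhs in rules)
```

If the returned list has two elements, a KBtt implementation has to pick one branch, whereas MKBtt continues with both by splitting the process.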
Since every simulated KBtt branch corresponds to a sequence of decisions on how to orient nodes, a process $p$ is modeled by a bit string in $(0 + 1)^*$. A set of processes $P$ is called well-encoded if there are no pairs of processes $p$ and $p'$ in $P$ such that $p$ is a proper prefix of $p'$. The initial process is represented by the empty string $\epsilon$.
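Well-encodedness is a prefix-freeness condition on bit strings and is easy to test. In the sketch below processes are Python strings over `0` and `1`, the empty string plays the role of $\epsilon$, and the function names are hypothetical.

```python
def is_well_encoded(processes):
    """True if no process is a proper prefix of another process."""
    procs = set(processes)
    return not any(p != q and q.startswith(p) for p in procs for q in procs)

def split_process(p):
    """The two successors of p when both orientations are kept."""
    return p + "0", p + "1"

def split_in_set(processes, p):
    """Replace p by its two successors; this preserves well-encodedness."""
    return (set(processes) - {p}) | set(split_process(p))
```

Starting from the initial set containing only the empty string, repeatedly replacing a process by its two successors always yields a well-encoded set, whereas keeping a process alongside one of its successors does not.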
MKBtt is described by an inference system operating on a set of nodes. In contrast to MKB, labels are now sets of processes instead of reduction orders, and in order to account for the constraint systems required in KBtt, nodes are extended with two additional constraint labels.
Definition 3 An MKBtt node $\langle s : t, R_0, R_1, E, C_0, C_1 \rangle$ contains as data two terms $s$ and $t$ and as labels sets of processes $R_0$, $R_1$, $E$, $C_0$, $C_1$, where the node condition requires that $R_0 \cup C_0$, $R_1 \cup C_1$ and $E$ are mutually disjoint.
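A direct transcription of Definition 3 into a data structure might look as follows. Field and function names are hypothetical, terms are kept as plain strings, and processes are bit strings.

```python
from typing import NamedTuple

class MkbttNode(NamedTuple):
    s: str
    t: str
    r0: frozenset  # rewrite label for s -> t
    r1: frozenset  # rewrite label for t -> s
    e: frozenset   # equation label
    c0: frozenset  # constraint label for s -> t
    c1: frozenset  # constraint label for t -> s

def node_condition(n):
    """R0 ∪ C0, R1 ∪ C1 and E must be mutually disjoint."""
    left = n.r0 | n.c0
    right = n.r1 | n.c1
    return not (left & right) and not (left & n.e) and not (right & n.e)

def processes(n):
    """All processes occurring in some label of the node."""
    return n.r0 | n.r1 | n.e | n.c0 | n.c1
```

Note that a process may occur in both $R_0$ and $C_0$ without violating the condition, since only the three unions are required to be disjoint.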
The process sets $R_0$, $R_1$ are called rewrite labels, $E$ is the equation label and $C_0$, $C_1$ are the constraint labels. As in the case of MKB, the node $\langle s : t, R_0, R_1, E, C_0, C_1 \rangle$ is identified with $\langle t : s, R_1, R_0, E, C_1, C_0 \rangle$. The sets of all processes occurring in a node $n$ or a node set $N$ are denoted by $\mathcal{P}(n)$ and $\mathcal{P}(N)$, respectively. To relate a node set $N$ to the corresponding states of the simulated KBtt processes, projections are used.

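Definition 4 For a node $n = \langle s : t, R_0, R_1, E, C_0, C_1 \rangle$ and a process $p$, the equation and rule projection of $n$ to $p$ are defined as

$$E[n, p] = \begin{cases} \{s \approx t\} & \text{if } p \in E \\ \emptyset & \text{otherwise} \end{cases} \qquad R[n, p] = \begin{cases} \{s \to t\} & \text{if } p \in R_0 \\ \{t \to s\} & \text{if } p \in R_1 \\ \emptyset & \text{otherwise} \end{cases}$$

The constraint projection $C[n, p]$ is defined analogously to $R[n, p]$. These projections are naturally extended to node sets by defining $E[N, p] = \bigcup_{n \in N} E[n, p]$, $R[N, p] = \bigcup_{n \in N} R[n, p]$ and $C[N, p] = \bigcup_{n \in N} C[n, p]$.

The inference rules of MKBtt are depicted in Fig. 4. Note that all rules preserve well-encodedness of labels and the disjointness condition on nodes. The following paragraphs add some clarifying remarks on the inference rules.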
- The orient rule applied to a node $\langle s : t, R_0, R_1, E, C_0, C_1 \rangle$ attempts to turn the equation $s \approx t$ into a rule for as many processes as possible. This is modelled in the node structure by moving processes $p \in E$ to rewrite labels. More precisely, the respective inference rule in KBtt is modelled by checking for every process $p \in E$ whether its

An Example
In this section we illustrate multi-completion with termination tools on the example system $\mathrm{CGE}_2$, which consists of the following equations: […] An MKBtt run starts with the initial node set […] When applying orient to nodes (2) and (3), only the direction from left to right yields valid and terminating rewrite rules. For node (4), both orientations are possible such that process $\epsilon$ is split into 0 and 1. These three nodes are thus modified as follows: $\langle i(x) \cdot x : e, \{0, 1\}, \emptyset, \emptyset, \{0, 1\}, \emptyset \rangle$ […] Nodes (5) and (6) can be oriented in both directions, independent of the orientation of associativity. Now the current node set contains eight processes (constraint labels are omitted for the sake of readability; at this point they coincide with the respective rewrite labels): […] A rewrite$_1$ step with node (11) simplifies this node to […] and adds […] The former is removed by gc and the latter is oriented to […] In a similar way, for processes in $P_0$ the overlap between (13) and (12) yields a node […] Additionally, there are critical peaks between (21) and (12), and between (20) and (12). Orienting the ensuing nodes yields […] Applying rewrite$_1$ with (11) to node (23) creates a node with the same data as (24) also for processes in $P_0$, such that a subsume step results in the updated node […] Now the peak $i(i(x))$ […] between (22) and (24) adds […] after a subsequent orient step. At this point overlaps between (26) and (25) and between (26) and (12) trigger the creation of nodes that are oriented as $\langle x \cdot e : x, P_0, \emptyset, \emptyset, P_0, \emptyset \rangle$ and $\langle x \cdot i(x) : e, P_0, \emptyset, \emptyset, P_0, \emptyset \rangle$. We obtain the modified node […] when using node (27) in a rewrite$_1$ step, together with a new node with data $(x \cdot i(y)) \cdot y : x$, which is oriented as […] Node (27) can also be used in rewrite$_2$ steps to modify (22) and (25) to […] and […] while adding […] and […] to the current node set. The latter is oriented into […] while node (33) is subject to a delete inference. The overlaps $i(e) \leftarrow i(e) \cdot e \rightarrow e$ between (27) and (12) and […] between (13) and (28) add $\langle i(e) : e, P_0, \emptyset, \emptyset, P_0, \emptyset \rangle$ and […] to the node set (in the latter case, after rewrite$_1$ using (27) simplifies $x \cdot e$ to $x$).
To make a long story short, we will only sketch the remainder of the run. After some additional deduce steps, the last node concerning plain group theory is derived, and can again be oriented in both directions, resulting in a split of all current processes. To complete the theory of homomorphisms, new nodes with data $f(e) : e$ and $f(i(x)) : i(f(x))$ and similar ones for $g$ are derived. The last kind of nodes again gives rise to process splits. It remains to orient node (16) and consider the critical pair $f(x) \cdot (g(y) \cdot z) : g(y) \cdot (f(x) \cdot z)$ before e.g. process 011110 succeeds with a convergent system after joining all remaining critical pairs: […]

The sequence of orientations gives rise to a process tree, where every branching point corresponds to a process split in an orient step. Part of the process tree developed during the described completion run is sketched in Fig. 5.

[…] For all other inference rules the split set is empty. For a step $N \vdash N'$ with split set $S$ and $p' \in \mathcal{P}(N')$, we define the predecessor of $p'$ as

$$\mathrm{pred}_S(p') = \begin{cases} p & \text{if } p' = p0 \text{ or } p' = p1 \text{ for some } p \in S \\ p' & \text{otherwise} \end{cases}$$

In Lemmas 1 and 2 we prove that an MKBtt step corresponds to a (possibly non-proper) KBtt step for every process occurring in some node, and every KBtt step can be modelled by MKBtt. Here, $\vdash^=$ denotes the reflexive closure of the KBtt inference relation $\vdash$.

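Lemma 1 For an MKBtt step $N \vdash N'$ with split set $S$ the KBtt step

$$(E[N, p], R[N, p], C[N, p]) \vdash^= (E[N', p'], R[N', p'], C[N', p']) \qquad (38)$$

is valid for all $p' \in \mathcal{P}(N')$ such that $p = \mathrm{pred}_S(p')$. Moreover, there exists at least one process $p' \in \mathcal{P}(N')$ for which the step is not an equality step if the rule applied in $N \vdash N'$ is not gc or subsume.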
Proof By case analysis on the MKBtt rule applied in (38).
- Assume orient with split set $S$ replaced the node $n = \langle s :$ […] Let $p'$ be a process in $\mathcal{P}(N')$ and $p = \mathrm{pred}_S(p')$ be its predecessor with respect to $S$. We thus have […]. These sets will in the sequel be denoted by $E_i$, $R_i$ and $C_i$, respectively. A further case distinction reveals three possibilities: […] similar reasoning as in the previous case shows that the simulated inference step is […] iii. Finally, if $p' \notin R_{lr} \cup R_{rl}$ then process $p$ was not affected in this inference step, so $p = p'$ and we have […] The projection of the considered MKBtt inference to process $p$ is thus an identity step.
In all remaining cases $p = p'$ holds as no process splitting occurs.
- Whenever delete removes some node $\langle s : s, \emptyset, \emptyset, E, \emptyset, \emptyset \rangle$ then $s \approx s \in E[N, p]$ for all $p \in E$, and hence delete also applies in KBtt. For all $p \notin E$ an identity step is obtained.
- Next, assume rewrite$_1$ was used. For every process $p \notin (R_0 \cup E) \cap R$ an identity step is obtained. Otherwise, two cases can be distinguished which are distinct due to the node condition.

i. If […] Hence compose can be applied to replace $s \to t$ by $s \to u$, which is modelled in MKBtt by moving $p$ from the rewrite label of a node with data $s : t$ to a node with data $s : u$.
ii. If […]
- In the case where rewrite$_2$ was applied, the inference is an identity step for every process $p \notin$ […] then compose or simplify can be applied, as argued in the case for rewrite$_1$.
iii. If $p \in R_1 \cap R$ then there are rules $\ell \to r$ and $t \to s$ in $R[N, p]$ such that the latter can be collapsed into an equation […]
- If gc was applied the step obviously corresponds to an identity step on the level of KBtt for every process $p \in \mathcal{P}(N)$, and the same holds for subsume.
Finally, for every inference rule the non-emptiness requirement for the set of affected labels ensures that the strict part holds for at least one p ∈ P(N ).
Lemma 2 Assume for a KBtt inference step $(E, R, C) \vdash (E', R', C')$ there exist a node set $N$ and a process $p$ such that […] Then there is some inference step $N \vdash N'$ with split set $S$ and a process $p'$ […]

Proof In the following case analysis on the applied KBtt rule, $(*)$ refers to the proof obligations […]
- Assume orient was applied to replace an equation $s \approx t \in E$ by the rule $s \to t \in R'$. Then there must be a node $n = \langle s : t, R_0, R_1, E, C_0, C_1 \rangle$ in $N$ such that $p \in E$ and $C \cup \{s \to t\}$ terminates. We distinguish two further cases. If $C \cup \{t \to s\}$ terminates as well, we set $S = \{p\}$. For $R_{lr} = \{p0\}$ and $R_{rl} = \{p1\}$ an application of orient yields […] For $p' = p0$ we have $p = \mathrm{pred}_S(p')$, and $(*)$ is satisfied. If $C[N, p] \cup \{t \to s\}$ does not terminate, we apply orient with $S = \emptyset$ and $R_{lr} = \{p\}$, which yields […] Thus we have $p = p'$ which trivially satisfies $p = \mathrm{pred}_S(p')$, and again $(*)$ holds.
In all remaining cases we can set $p' = p$ since no splitting occurs.
- If compose rewrites $s \to t$ to $s \to u$ using a rule $\ell \to r$, $N$ contains nodes $n = \langle s : t, R_0, R_1, E, C_0, C_1 \rangle$ and $\langle \ell : r, R, \ldots \rangle$ such that $p \in R_0 \cap R$. Thus rewrite$_1$ or rewrite$_2$ applies, depending on whether $t \doteq \ell$ or $t \mathrel{\dot{\rhd}} \ell$. We obtain […] where $R_1$ and $E$ depend on which inference rule applies. Since $p \in E \cap R$, $(*)$ holds.
- Assume collapse is applied to turn a rule $t \to s$ into an equation $u \approx s$ using $\ell \to r$. Then $t \mathrel{\dot{\rhd}} \ell$ must hold, and […] To satisfy $(*)$ we can thus apply rewrite$_2$ to obtain […]

Since MKBtt steps are reflected in KBtt, an MKBtt run $\gamma$ of the form $N_0 \vdash^* N$ corresponds to a valid KBtt run $\gamma_p$ for every process $p \in \mathcal{P}(N)$.

Definition 6 Consider an MKBtt run $\gamma$ of the form $N_0 \vdash N_1 \vdash \cdots \vdash N_k$ and some process $p \in \mathcal{P}(N_k)$. We inductively define the sequence $p_0, \ldots, p_k$ of ancestors of $p$ by setting $p_k = p$ and $p_i = \mathrm{pred}_{S_i}(p_{i+1})$ for $0 \leq i < k$, where $S_i$ is the split set of the step $N_i \vdash N_{i+1}$. Then the projected run $\gamma_p$ is the sequence

$$(E[N_0, p_0], R[N_0, p_0], C[N_0, p_0]) \vdash^= \cdots \vdash^= (E[N_k, p_k], R[N_k, p_k], C[N_k, p_k])$$

According to Lemma 1, $\gamma_p$ is a valid KBtt run for every process $p$.
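The ancestor sequence of Definition 6 can be computed by walking backwards through the split sets of a run. The sketch assumes that $\mathrm{pred}_S(p')$ strips the final bit of $p'$ exactly when the resulting prefix lies in the split set $S$, and returns $p'$ unchanged otherwise; this is the reading suggested by the proof of Lemma 2, where $S = \{p\}$ and $p' = p0$ give $p = \mathrm{pred}_S(p')$. It also assumes well-encoded process sets, so that a process and its one-bit extension never coexist. Function names are hypothetical.

```python
def pred(split_set, p_new):
    """Predecessor of p_new with respect to the split set of one inference step."""
    if p_new and p_new[:-1] in split_set:
        return p_new[:-1]  # p_new arose from splitting its prefix
    return p_new           # the process was not split in this step

def ancestors(split_sets, p):
    """Sequence p_0, ..., p_k for split sets S_0, ..., S_{k-1} and p_k = p."""
    chain = [p]
    for s in reversed(split_sets):
        chain.append(pred(s, chain[-1]))
    chain.reverse()
    return chain
```

For a run of three steps in which the first step splits the initial process and the third step splits process 0, the process 01 has the ancestor sequence $\epsilon$, 0, 0, 01.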
Using projections, the definitions of success, failure and fairness given for KBtt can be naturally extended to MKBtt.
Definition 7 An MKBtt run $\gamma$ of the form $N_0 \vdash^* N$
- is fair if $\gamma_p$ is fair and nonfailing for some process $p \in \mathcal{P}(N)$,
- succeeds if $E[N, p] = \emptyset$ for some process $p \in \mathcal{P}(N)$, and
- fails if $\gamma_p$ fails for all processes $p \in \mathcal{P}(N)$.
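The projections and the success condition can be sketched on a concrete node set. The code assumes that the equation projection collects $s \approx t$ from every node whose equation label contains $p$, that the rule projection collects $s \to t$ for $p \in R_0$ and $t \to s$ for $p \in R_1$, and that the constraint projection does the same for $C_0$ and $C_1$. Names are hypothetical, terms are plain strings, and the second node in `EXAMPLE_NODES` is made up for illustration.

```python
from typing import NamedTuple

class Node(NamedTuple):
    s: str
    t: str
    r0: frozenset
    r1: frozenset
    e: frozenset
    c0: frozenset
    c1: frozenset

def eq_projection(nodes, p):
    """E[N, p]: equations that process p still regards as unoriented."""
    return {(n.s, n.t) for n in nodes if p in n.e}

def _oriented(nodes, p, left, right):
    rules = {(n.s, n.t) for n in nodes if p in getattr(n, left)}
    rules |= {(n.t, n.s) for n in nodes if p in getattr(n, right)}
    return rules

def rule_projection(nodes, p):
    """R[N, p]: rules of process p, as (lhs, rhs) pairs."""
    return _oriented(nodes, p, "r0", "r1")

def constraint_projection(nodes, p):
    """C[N, p]: constraint rules of process p."""
    return _oriented(nodes, p, "c0", "c1")

def all_processes(nodes):
    procs = set()
    for n in nodes:
        procs |= n.r0 | n.r1 | n.e | n.c0 | n.c1
    return procs

def succeeding_processes(nodes):
    """Processes whose equation projection is empty."""
    return {p for p in all_processes(nodes) if not eq_projection(nodes, p)}

EMPTY = frozenset()
EXAMPLE_NODES = [
    Node("i(x)*x", "e", frozenset({"0", "1"}), EMPTY, EMPTY, frozenset({"0", "1"}), EMPTY),
    Node("a", "b", EMPTY, frozenset({"0"}), frozenset({"1"}), EMPTY, frozenset({"0"})),
]
```

In this example process 0 has oriented both nodes and therefore has an empty equation projection, while process 1 still carries the equation between `a` and `b`.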
It is easy to see that MKBtt is sound in the sense that the equational theory is preserved.
Lemma 3 Consider an MKBtt step $N \vdash N'$ with split set $S$ and a process $q \in \mathcal{P}(N')$ with $p = \mathrm{pred}_S(q)$. The relations […]

As the simulation of KBtt with MKBtt is sound (Lemma 1) and complete (Lemma 2), it is straightforward to establish correctness and completeness using the corresponding results for KBtt. We call an MKBtt run $\gamma : N_0 \vdash^* N$ simplifying if the resulting system $R[N, p]$ is reduced whenever $\gamma$ succeeds for some process $p$.

Theorem 4 Let $N_E$ be the initial node set for a set of equations $E$ and let $\gamma$ be a finite non-failing MKBtt run of the form $N_E \vdash^* N$ which is fair for some $p \in \mathcal{P}(N)$. Then $R[N, p]$ is convergent.
Proof According to Lemma 1 there is a corresponding KBtt run γ p which is nonfailing and fair.Since finite runs of KBtt are correct (Theorem 3), R[N, p] is convergent.

Completeness
Theorem 2 states the completeness of KB in the following sense: If a set of equations E admits an equivalent finite convergent rewrite system R, any fair KB run will produce an equivalent finite convergent system if a reduction order compatible with R is used, provided the run does not fail.The following example shows that MKBtt might even fail if one uses a termination tool T that can prove the termination of R.
Example 1 The convergent rewrite system $R$ consisting of the rules […] in any fair run of standard completion that uses the reduction order $\to^+_R$. The system $R$ is easily shown to be terminating with a matrix interpretation of dimension 2; e.g. the termination tool $\mathsf{T_TT_2}$ using the strategy matrix -ib 2 -d 2 -direct immediately outputs a termination proof. However, if a KBtt run uses $\mathsf{T_TT_2}$ with this strategy and starts by orienting $h(a, a) \approx i(a, a)$ then no matter which orientation is chosen, one of the equations in the leftmost column remains unorientable. Similarly, if MKBtt starts by applying orient to $h(a, a) \approx i(a, a)$ then process $\epsilon$ gets split into 0 and 1. But in subsequent steps neither process can orient both of the equations in the leftmost column, so the run fails.
This example shows that the order in which nodes are processed has considerable influence: orienting nodes too early can prevent KBtt and MKBtt from producing a convergent system even if a successful run exists. Nevertheless, completeness in this sense can be partially obtained in a slightly modified version of MKBtt which we will refer to as MKBtt$_c$. In contrast to the previous version, a process can now also keep an equation unoriented. For this purpose, processes are now viewed as strings in $(0 + 1 + {-})^*$. We write $T \vdash R$ if the termination tool $T$ can verify termination of the rewrite system $R$. The orient rule in MKBtt$_c$ is given in Fig. 6. Here, $\mathrm{split}(E_{lr}, E_{rl}, N)$ replaces every occurrence of a process $p \in E_{lr} \cap E_{rl}$ in a node of $N$ by $\{p{-}, p0, p1\}$, every occurrence of $p \in E_{lr} \setminus E_{rl}$ by $\{p{-}, p0\}$ and every occurrence of $p \in E_{rl} \setminus E_{lr}$ by $\{p{-}, p1\}$. The notion of a split set in MKBtt is replaced by split tuple, which refers to the pair of process sets $(E_{lr}, E_{rl})$. For all inference steps that use a different rule than orient, the split tuple is $(\emptyset, \emptyset)$.
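The effect of the split operation on a single label can be sketched as follows. Processes are strings over `0`, `1` and the ASCII character `-`, which stands in for the symbol used for unoriented equations; the function names and the representation of a node as a tuple of labels are assumptions made for illustration only.

```python
def split_label(label, e_lr, e_rl):
    """Replace every process of the label according to the split tuple (e_lr, e_rl)."""
    result = set()
    for p in label:
        if p in e_lr or p in e_rl:
            result.add(p + "-")      # successor that keeps the equation unoriented
            if p in e_lr:
                result.add(p + "0")  # successor orienting from left to right
            if p in e_rl:
                result.add(p + "1")  # successor orienting from right to left
        else:
            result.add(p)            # processes outside the split tuple are kept
    return frozenset(result)

def split_nodes(labels_per_node, e_lr, e_rl):
    """Apply the replacement to every label of every node."""
    return [tuple(split_label(label, e_lr, e_rl) for label in labels)
            for labels in labels_per_node]
```

A process that can orient in both directions thus obtains three successors, while a process that can orient in only one direction obtains two.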
To obtain a completeness result for MKBtt$_c$, we need a stronger notion of fairness, which requires all processes to be advanced equally at some point.

Definition 8 Consider an equational proof […] $)$ such that $P \Rightarrow_{\mathsf{KB}} Q$, or
- all direct successors $q \in \mathcal{P}(N_{k+1})$ of $p$ eventually simplify $P$ starting from $N_{k+1}$.
Thus a run $\gamma$ with process $p \in \mathcal{P}(N_k)$ eventually simplifies a proof $P$ if all successors of $p$ in $\gamma$ allow for a smaller proof at some point.
[…] satisfying $P \Rightarrow_{\mathsf{KB}} Q$, then every successor $q$ of $p$ either performed an orient step on $n$ and got extended by $-$ in this step, or eventually simplifies $P$ from $N_k$.

Definition 9 (Strong fairness) A run […]
Here $\succ$ denotes the reduction order $\to^+_{C[N_k, p]}$.
Intuitively, a strongly fair run requires all processes to simplify an equational proof if this simplification can be done without process splits (case i).Moreover, if an orient step on, say, a node with data s : t allows for a simplification then all processes except the one that does not orient s : t are required to perform this step (case ii).A sufficient condition for a run to be strongly fair is that all processes are advanced using a breadth-first strategy.
A termination tool $T$ covers some reduction order $\succ$ if for any rewrite system $R$ that is compatible with $\succ$, $T \vdash R$ holds.
[…] $)_{k \geq 0}$ which uses $\succ$ as reduction order. 2. Let $E_\omega$ and $R_\omega$ denote the persistent sets of $\gamma_p$. Suppose $P$ is a proof in $(E_\omega, R_\omega)$ which is not a rewrite proof and there exists an inference $(E_\omega, R_\omega) \vdash (E, R)$ such that $(E, R)$ admits a proof $Q$ satisfying $P \Rightarrow_{\mathsf{KB}} Q$. Then there must be a node set $N_j$ in $\gamma$ such that $(E[N_j, p_j], R[N_j, p_j])$ contains all equations and rules that are used in $P$ together with those used when simplifying $P$ to $Q$. By adapting Lemma 2 to MKBtt$_c$, it follows that there is an inference step $N_j \vdash N$ such that […] and $C[N, p'] \subseteq {\succ}$ holds for some successor $p'$ of $p_j$, and $(E', R')$ admits proof $Q$.
We distinguish two cases. If $(E[N_j, p_j], R[N_j, p_j]) \vdash (E', R')$ and therefore $N_j \vdash N$ does not apply orient then no process splitting occurs and $p_j \in \mathcal{P}(N)$. By strong fairness, $p_j$ eventually simplifies $P$. In particular, some successor $p_m$ in the process sequence […] $)$ such that $P \Rightarrow_{\mathsf{KB}} Q'$. Therefore also $\gamma_p$ allows for this simplified proof. Now suppose $(E[N_j, p_j], R[N_j, p_j]) \vdash (E', R')$ applied orient to some equation $s \approx t$ and $s \succ t$ holds. By construction of the sequence $(p_k)_{k \geq 0}$ no successor of $p_j$ can have obtained $-$ as part of its label when orienting a node with data $s : t$. Hence, according to strong fairness all successors of $p_j$ have to eventually simplify $P$. So some […] which proves fairness of this KB run.
The following completeness result shows that an MKBtt c run employing a sufficiently powerful termination prover can produce any convergent system which is derivable in a KB run.
Theorem 5 Consider a f inite canonical rewrite system R which can be constructed from E in a fair KB run using .If T covers then any strongly fair and simplifying MKBtt c run N 0 N 1 N 2 • • • which uses T and does not have a failing process develops some process p Proof According to Lemma 4 there is a sequence of processes ) k 0 is a fair KB run using .By repeating the following argument of [4, Theorem 3.9], we will see that this run succeeds with system R.Each rule → r in R is a theorem in E and therefore will have a persisting rewrite proof after a finite number of steps in every fair and unfailing run.Let R ⊆ i R[N i , p i ] be the set of rules required for proofs of all rules in R.Both R and R are contained in .Hence all these proofs must be of the form → * R r: Suppose r was reducible in R to a term r such that r r .Then there must also be a proof r ↔ * R r as R is a convergent presentation of the theory.But r r implies that r is reducible in R, contradicting the assumption that R is canonical.
Thus → R ⊆ → + R′ holds. As R and R′ have the same equational theory, R′ must be convergent and hence canonical since it was constructed by a simplifying run. Thus R and R′ have to be equal, because the canonical rewrite system compatible with a given reduction order is unique (up to variable renaming) [19].

if a suitable strategy is adopted, but fail with the unorientable equation b ≈ c if the equations are processed in an unfortunate order [7]. This example also illustrates that for completeness it is not sufficient to require that T can prove termination of R, or covers → + R ; any run on E where T only supports the reduction order → + R fails immediately.

Critical Pair Criteria
In order to limit the number of deduced equations during a completion run, several critical pair criteria were proposed as a means to filter out critical pairs that can be ignored without compromising completeness [3,10,14,33]. In a later work, Bachmair and Dershowitz [4] showed that these criteria match the more general pattern of compositeness. Before describing the use of critical pair criteria in MKBtt, the relevant definitions and some concrete criteria are recalled. We consider a fixed reduction order ≻, a proof order and a proof reduction relation ⇒. For details the reader is referred to [4].

Critical Pair Criteria in Standard Completion
A critical pair criterion CPC is a mapping from sets of equations to sets of equations such that CPC(E) is a subset of CP(E). Intuitively, CPC(E) contains those critical pairs that are considered redundant. A run is fair with respect to CPC if for every peak P associated with a critical pair in CP(R ω ) \ ⋃ i CPC(R i ∪ E i ) there exists a proof Q in (E i , R i ) for some i > 0 such that P ⇒ Q. A critical pair criterion CPC is correct if a nonfailing run is fair in the general sense whenever it is fair with respect to CPC. Clearly, correct critical pair criteria make it possible to filter out unnecessary critical pairs without compromising completeness.
An equational proof P that has the form of a peak s ← u → t is composite if there exist terms u 0 , . . ., u n+1 where s = u 0 and t = u n+1 and proofs P 0 , . . ., P n such that P i proves u i ≈ u i+1 and P P i holds for all 1 i n.The compositeness criterion returns all critical pairs among equations in E for which the associated overlaps are composite, which was proven to be correct [3].This very general criterion is hard to apply in practice.However, some of the earlier proposals to filter out superfluous critical pairs in completion procedures actually capture special cases of compositeness.
Kapur et al. [10] introduced the notion of primality for critical pairs. An overlap tσ ← sσ = sσ[uσ] p → sσ[vσ] p between rules s → t and u → v in R is prime if sσ is not reducible at any position strictly below p. The primality criterion PCP(R) returns all critical pairs among rules in R for which the associated overlaps are not prime. A special case of PCP is captured by the unblockedness criterion BCP [3]. A critical pair originating from an overlap tσ ← sσ = sσ[uσ] p → sσ[vσ] p is blocked if xσ is irreducible in R for all variables x ∈ Var(s) ∪ Var(u). The set BCP(R) contains all unblocked critical pairs among rules in R.
Küchlin [14] introduced the notion of connectedness to limit equational consequences deduced in a completion procedure. A critical pair s ≈ t originating from an overlap s ← u → t is connected below u if there exists an equational proof Clearly, if a critical pair s ≈ t is connected below u such that n > 1 then it is also composite. For a practical criterion, Küchlin assumes → ⊆ ≻ and concentrates on finding connecting sequences u 1 , . . ., u n−1 such that u → + u i . As a special case the following weak connectivity test is proposed. Given an overlap tσ ← sσ = sσ[uσ] p → sσ[vσ] p between rules s → t and u → v, the associated critical pair is connected if there exists a reduction step sσ → w using a rule ℓ → r at position q fulfilling the (nonexclusive) properties: (1) if q ∈ Pos F (s) then the critical overlap ⟨s → t, q, ℓ → r⟩ is already considered, (2) if q = pq′ and q′ ∈ Pos F (u) then ⟨ℓ → r, q′, u → v⟩ is already considered, and (3) if p = qp′ and p′ ∈ Pos F (ℓ) then ⟨u → v, p′, ℓ → r⟩ is already considered. This criterion is generalized to a full connectivity test where the critical pair is connected via an arbitrary sequence w 0 , . . ., w n instead of a single intermediate term w. In the sequel the connectedness criterion returning connected critical pairs among rules in R will be referred to as CCP(R).
Since both CCP and PCP are special cases of compositeness, these criteria can also be combined.This mixed criterion that filters out critical pairs that are redundant according to one of the criteria will in the sequel be referred to as MCP.

Critical Pair Criteria in MKBtt
In the following paragraphs we describe how critical pair criteria can be integrated into MKBtt.Intuitively, the set E contains all processes in E for which the critical pair derived from o is not superfluous.Thus, in the deduce rule for MKBtt the equation label of the new node is filtered by the criterion as shown in Fig. 7.
Consider a finite MKBtt run γ of the form N 0 ⊢* N k and a process p ∈ N k . Let p i denote the ancestor of p in N i and let ≻ denote the reduction order → + C[N k , p] . Then we call γ fair with respect to CPC m and p if the following condition holds: Whenever a node set N i gives rise to an overlap o with critical pair s ≈ t as described in Fig. 7 and  p The run γ is fair with respect to CPC m if it is fair with respect to CPC m and some process p ∈ P(N k ). An MKBtt critical pair criterion CPC m is correct if every finite non-failing run γ that is fair with respect to CPC m is also fair in the sense of Definition 7.

Fig. 7 The deduce inference rule using a critical pair criterion

Lemma 5 Every MKBtt critical pair criterion CPC m obtained from a correct criterion CPC is correct.
Proof Let γ be a finite non-failing run of the form N 0 * N k which is fair with respect to CPC m and some process p ∈ P(N k ), and let p i denote the ancestor of p in N i .Assume a critical overlap o where p i ∈ E \ CPC m (o, E, N i ) for some ancestor p i of p.By definition there exists a proof Q in some (E[N j , p j ], R[N j , p j ]) such that we have s ≈ t ⇒ KB Q. Hence the projected run γ p is fair with respect to CPC and by correctness of CPC it is also fair.Thus γ is fair as well.
Thus the use of MKBtt criteria obtained from correct standard criteria does not compromise completeness, and the chance of a run to succeed is not influenced.The following example illustrates the use of critical pair criteria in MKBtt.
The overlap ⟨y • i(x • y) → i(x), ε, e • x → x⟩ produces the critical pair i(x) ≈ i(x • e) for the set of processes P 0 ∪ P 1 . When PCP m is applied, it is checked whether there exists a node which can reduce the term u = e • i(x • e) at some position below the root. Since node (41) can reduce u at position 21, the critical pair is recognized as redundant for all processes in P 0 such that the deduced node is ⟨i(x) : i(x • e), ∅, ∅, P 1 , ∅, ∅⟩. Furthermore, the overlap ⟨i(x) • x → e, 1, i(e) → e⟩ between nodes (40) and (42) gives rise to the critical pair e ≈ e • e for the process set P 0 . To reduce the term i(e) • e also node (41) can be applied at the root position. While PCP m is not applicable since the overlap position is below ε, CCP m requires checking the critical pairs involved in the decomposition. Indeed, the critical pair e ← i(e) • e → i(e) between nodes (40) and (41) is already covered by node (42) and the peak involving nodes (41) and (42) can be ignored since it is just a variable overlap. Hence the deduce step is superfluous.
The critical pair criteria PCP, BCP, and CCP require to check whether an overlapped term can be reduced in a certain way other than indicated by the overlap itself.Since MKBtt allows to check reducibility for multiple processes at once, the redundancy checks required for the respective multi-completion criteria can be shared among multiple processes.

Isomorphisms
The performance of our tool mkbTT is significantly affected by the number of simulated processes.On some input problems, runs exhibit similar process pairs which have the same probability of success.
Example 4 A run on CGE 2 may generate a node set N with process p where E[N, p] consists of the equations If an inference step N N applies orient to the equation g(x) • f(y) ≈ f(y) • g(x), the process p is split as both orientations are possible.But the states of the emerging child processes p0 and p1 are the same up to interchanging f and g.Hence further deductions of these processes will be symmetric.
Such similarities between processes are generally captured by isomorphisms.
Lemma 6 Let N p and N q be node sets containing processes p and q such that If there is a step N p N p such that p is the predecessor of p ∈ P(N p ) then there is also an inference step N q = N q and a process q ∈ P(N q ) such that q is the predecessor of q and ) and p ∈ P(N p ), then we can set N q = N q .Otherwise a mirror step N q N q using the same inference rule can model the step for q.More precisely, the mirror step is defined by case distinction on the rule applied in N p N p .
-Assume orient turned a node n = s : Then a node n with data θ(s) : θ(t) has to occur in N q since E[N q , q] ∼ = E[N p , p] holds by assumption.Three further possibilities can be distinguished: Assume p = p and p ∈ E lr \ S because C[N p , p] ∪ {s → t} terminates, but C[N p , p] ∪ {t → s} does not.Thus also C[N q , q] ∪ {θ(s) → θ(t)} terminates, but C[N q , q] ∪ {θ(t) → θ(s)} does not.So the mirror step N q N q can apply orient to node n with E lr = R rl = {q} and E rl = R rl = ∅.In the second case where p = p and p ∈ E rl \ S, we reason symmetrically to the preceding case.For the final case, let p ∈ S. We orient n to obtain the mirror step N q N q .Because C[N p , p] ∼ = C[N q , q] holds, both orientations terminate, so q gets split into q0 and q1 which are then isomorphic to p0 and p1, respectively.
All remaining inference rules do not split processes so p = p and thus q = q .
-Assume delete was applied to a node s : s, ∅, ∅, E, ∅, ∅ where p ∈ E. Then there must be a node n of the form θ(s) : θ(s), ∅, ∅, E , ∅, ∅ in N q such that q ∈ E , and N q N q will be a delete step removing n .-If deduce created a node with data s : t, ∅, ∅, R ∩ R , ∅, ∅ originating from a critical pair involving nodes with terms : r and : r , due to R[N p , p] ∼ = R[N q , q], there must be two nodes with respective data θ( ) : θ(r), R, . . .and θ( ) : θ(r ), R , . . . in N q which allow in a mirror step N q N q to deduce a node θ(s) : θ(t), ∅, ∅, R ∩ R , ∅, ∅ .-If rewrite 1 or rewrite 2 was applied to a node s : t, R 0 , R 1 , E, C 0 , C 1 using a rule node : r, R, . . . to create s : u, R 0 , ∅, E , ∅, ∅ , then the mirror step N q N q can apply the same rule to nodes with data θ(s) : θ(t) and θ( ) : θ(r), which exist by assumption.Then q is removed from the rewrite or equation label of the node with data θ(s) : θ(t), and occurs now in a node with data θ(s) : θ(u).
Theorem 6 Let N i be a set of nodes containing isomorphic processes p i , q i ∈ P(N i ).
Assume there exists an MKBtt completion run γ of the form N i * N k and a process p k ∈ P(N k ) such that p i is the ancestor of p k in N i , and the projected run γ pk is fair and successful.Then there is also a fair deduction γ of the form N i * N k producing a process q k ∈ P(N k ) such that q i is an ancestor of q k , and also γ qk is fair and successful.
Proof We show by induction on the length of γ : N i ⊢* N k that there exists a sequence γ : For the case where k = i the processes p k and q k are by assumption isomorphic via some mapping θ. If k > i we consider a sequence N i ⊢* N k−1 ⊢ N k where p i is the ancestor of p k in N i . By the induction hypothesis there is a sequence Hence given γ : Since all intermediate states of p k and q k are isomorphic, γ is fair for q k .

If our tool mkbTT detects two isomorphic processes in the current node set N then one process is deleted from all nodes in N. We exploit two concrete shapes of symmetries. Renaming isomorphisms swap function symbols as in Example 4, where p0 and p1 are isomorphic under the mapping θ that exchanges f and g. Argument permutations associate with every function symbol f of arity n a permutation π f of the set {1, . . ., n}. Then the mapping on terms which is defined by θ(x) = x and $\theta(f(t_1, \dots, t_n)) = f(\theta(t_{\pi_f(1)}), \dots, \theta(t_{\pi_f(n)}))$ also induces an isomorphism. For example, when completing SK90-3.02 [25] a process with state has to orient the associativity axiom. Both orientations preserve termination, but the two child processes emerging from a process split are isomorphic under the argument permutation π + = (1 2).
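The effect of an argument permutation can be traced on the associativity axiom with a small sketch. The term encoding (variables as strings, applications as tuples) and the function name `permute_args` are assumptions made for illustration; they are not part of mkbTT.

```python
def permute_args(term, perms):
    """Apply theta: variables stay fixed, arguments of f are reordered by perms[f].

    perms maps a function symbol to a tuple (pi(1), ..., pi(n)) of 1-based
    indices; symbols without an entry keep their argument order.
    """
    if isinstance(term, str):  # variable
        return term
    f, args = term[0], term[1:]
    pi = perms.get(f, tuple(range(1, len(args) + 1)))
    return (f,) + tuple(permute_args(args[pi[i] - 1], perms)
                        for i in range(len(args)))

perms = {"+": (2, 1)}                      # the permutation (1 2) for +
lhs = ("+", ("+", "x", "y"), "z")          # (x + y) + z
rhs = ("+", "x", ("+", "y", "z"))          # x + (y + z)

print(permute_args(lhs, perms))            # ('+', 'z', ('+', 'y', 'x'))
print(permute_args(rhs, perms))            # ('+', ('+', 'z', 'y'), 'x')
```

Up to renaming of variables, the image of (x + y) + z → x + (y + z) is the reversed rule x + (y + z) → (x + y) + z, which is why the two orientations of associativity lead to isomorphic child processes.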

Implementation
The inference system described in the previous sections is implemented in our tool mkbTT. The tool is written in the programming language OCaml. Binaries and sources are available from the tool's website http://cl-informatik.uibk.ac.at/mkbtt/ where also a web interface can be found. In the following sections we provide implementation details which were found to be of special importance.

Control
The basic control of mkbTT is a multi-completion variant of a discount loop, very similar to the one originally proposed for completion with multiple reduction orders [15]. Pseudo-code describing the control loop is given in Fig. 8. The procedure advances two node sets containing open nodes No and closed nodes Nc, corresponding to passive and active facts. Intuitively, closed nodes have been fully exploited with respect to the orient and rewrite 1,2 inference rules, and to every pair of closed nodes deduce has been applied exhaustively. Therefore, at the beginning of a run Nc is empty whereas No contains the set of initial nodes.

Fig. 8 Procedure implementing MKBtt
We briefly describe the components occurring in the main control loop. At the beginning of each recursive call it is checked whether some process succeeded. For this purpose, a process p is considered successful if it does not occur in an open node or in a closed equation label, i.e., all of E[Nc, p], R[No, p] and E[No, p] are empty. If no successful process exists but there are open nodes left, choose selects a node n from the set of open nodes. Different selection strategies are considered for this purpose (see Section 6.4). The function rewrite(N, N′) applies rewrite 1,2 to nodes in N by employing nodes in N′ as rules. Nodes are implemented as mutable structures, so the original objects are modified and only newly created nodes are returned. The function call orient(n, No, Nc) is used to apply the orient inference to node n. It returns the modified node along with the node sets No and Nc adapted by the split operation. Immediately after rewrite and deduce calls, delete is invoked to filter out nodes with equal terms. After having been subjected to rewrite, all labels in a node might become empty. The purpose of gc is to filter out such nodes. The call deduce(n, Nc) returns all equational consequences originating from deduce inference steps involving node n together with some node from Nc. To avoid redundant nodes, the union operation exploits the subsume inference.

Termination Checking
Our tool supports two possibilities for termination checks in orient inference steps. They can be performed internally by interfacing TTT2 [13] with the user's favourite termination strategy supplied in TTT2's strategy language. Alternatively, any external termination checker can be used which complies with a minimal interface: it must take as argument the name of a file specifying the termination problem in TPDB format 2 and print YES on the first line of the output if termination could be established.
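The minimal interface for external checkers can be exercised with a short sketch. The function name `proves_termination`, the default timeout, and the treatment of any answer other than YES as "not proved" are assumptions of this example, not documented behaviour of mkbTT.

```python
import subprocess
import sys

def proves_termination(checker_cmd, problem_file, timeout_s=5.0):
    """Run an external checker on problem_file.

    checker_cmd is the command as a list of strings; the file name is
    appended as its last argument. Returns True only if the first line
    of standard output is YES. Timeouts and start-up errors count as
    "termination not established" (an assumption of this sketch).
    """
    try:
        result = subprocess.run(
            list(checker_cmd) + [problem_file],
            capture_output=True,
            text=True,
            timeout=timeout_s,
        )
    except (subprocess.TimeoutExpired, OSError):
        return False
    lines = result.stdout.splitlines()
    return bool(lines) and lines[0].strip() == "YES"

# Stand-in checker for demonstration: a Python one-liner that always
# answers YES. A real setup would name a termination prover script here.
fake_checker = [sys.executable, "-c", "print('YES')"]
print(proves_termination(fake_checker, "problem.trs"))
```

A wrapper script around a concrete prover only has to translate that prover's verdict into a first output line reading YES when termination was shown.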

Term Indexing
Automated deduction tools typically spend a major part of computation time on deducing equational consequences and rewriting.A variety of sophisticated term indexing techniques [24] have been developed in order to speed up filtering out matching and unifiable terms.Also mkbTT relies on indexing techniques to quickly sift through nodes that are applicable for inferences.For instance, for deduce inferences the retrieval of unifiable (sub)terms is needed.For rewrite 1 steps, variants have to be retrieved and rewrite 2 requires encompassment retrieval, where the latter can be achieved by repeatedly retrieving subsumptions (also referred to as generalizations in the literature).To select unifiable terms, mkbTT implements path indexing and discrimination trees [9,18], for variant and subsumption retrieval also code trees [28] are supported.

Selection Strategies
An iteration of mkbTT's main control loop starts by selecting a node to process next. For this purpose a choice function picks the node where a given cost heuristic evaluates to a minimal value. The measure applied in this selection has significant impact on performance. To allow for a variety of possibilities, a strategy language is defined that is general enough to cover selection strategies that proved to be useful, but also captures some concepts used in choice strategies of other tools. A strategy is specified by the grammar rule

strategy ::= ? | (node,strategy) | float(strategy : strategy)

Here node refers to a node property, which is in turn defined via properties of process sets, processes, sets of equations, rewrite systems and term pairs:

- A node property of a node n = ⟨s : t, . . ., E, . . .⟩ can be its creation time (denoted by *), a property of the node's data s : t, or a process set property pp of its equation labels E, which is written as el(pp). Node properties can also be added or inverted.
- A process set property takes either the sum or the minimum over a property of the involved processes, or may simply be the number of processes it contains, which is denoted by #.
- Given a current node set N, a process property of a process p may be an equation set property ep of its equation projection E[N, p] or a rewrite system property tp of either its rule projection R[N, p] or its constraint projection C[N, p], which is expressed by writing e(ep), r(tp) and c(tp), respectively. Process properties can also be added.
- An equation set property of a set of equations E can be its cardinality (#) or the sum over a term pair property of the contained equations. A rewrite system property of a rewrite system R can additionally be a property of its set of critical pairs CP(R).
- A term pair property of s : t can be the sum |s| + |t| or maximum max{|s|, |t|} of the sizes of the involved terms, which is specified by writing ssum and smax.
- Finally, properties are combined to obtain selection strategies. The simplest possible strategy is ?, which is implemented by picking a node randomly. Alternatively, a strategy may combine a node property np with another strategy s to a tuple (np,s). By using this rule multiple times, a node property list of the form (np 1 , . . .(np k ,?) . . . ) can be constructed. To mix strategies, a strategy can also be of the shape r(s 1 :s 2 ), where r is assumed to be a rational value in the closed interval [0, 1]. Node property lists are evaluated by mapping each node to the corresponding tuple of integer values, its cost, and choosing the (lexicographic) minimum. In case of a mixed strategy r(s 1 :s 2 ), the strategy s 1 is applied with probability r, and s 2 is used in the remaining cases.
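The evaluation of strategies described above can be sketched as follows. The encoding is hypothetical: `None` stands for the strategy ?, a pair `(property, strategy)` for a node property list, and a `Mix` object for r(s 1 :s 2 ). Nodes are plain dictionaries, ties among minimal nodes are broken randomly (an assumption), and the sketch only handles property lists that end in ?.

```python
import random
from dataclasses import dataclass

RANDOM = None  # stands for the strategy "?"

@dataclass
class Mix:
    """Mixed strategy r(s1:s2): use s1 with probability r, otherwise s2."""
    r: float
    first: object
    second: object

def cost(node, strategy):
    """Map a node to its cost tuple under a property list (np1,(np2,...?))."""
    values = []
    while strategy is not RANDOM:
        prop, strategy = strategy
        values.append(prop(node))
    return tuple(values)

def select(nodes, strategy, rng=random):
    """Pick a node according to the given strategy."""
    if strategy is RANDOM:
        return rng.choice(nodes)
    if isinstance(strategy, Mix):
        chosen = strategy.first if rng.random() < strategy.r else strategy.second
        return select(nodes, chosen, rng)
    best = min(cost(n, strategy) for n in nodes)   # lexicographic minimum
    return rng.choice([n for n in nodes if cost(n, strategy) == best])

def ssum(node):
    """Term pair property ssum: |s| + |t|."""
    s_size, t_size = node["sizes"]
    return s_size + t_size

def created(node):
    """Node property *: creation time."""
    return node["time"]

def size_age(r):
    """Size/age strategy r((data(ssum),?):(*,?))."""
    return Mix(r, (ssum, RANDOM), (created, RANDOM))

nodes = [
    {"id": "a", "sizes": (3, 4), "time": 2},
    {"id": "b", "sizes": (1, 2), "time": 5},
    {"id": "c", "sizes": (6, 1), "time": 0},
]
print(select(nodes, size_age(1.0))["id"])  # always size-based: smallest node
print(select(nodes, size_age(0.0))["id"])  # always age-based: oldest node
```

With r strictly between 0 and 1 the choice alternates randomly between the two property lists, which is how a size/age ratio is approximated.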
As selection measures have considerable impact, many different strategies for automated reasoning tools have been reported in the literature.For instance, Vampire [29] employs a size/age ratio when deciding on a fact to be processed next.If this ratio is (e.g.) 2 : 3 then out of 5 selections 2 will pick the oldest and 3 the smallest node, i.e., the node where the sum of its term sizes |s| + |t| is minimal.In mkbTT this approach can be achieved with the strategy s size/age (r) = r((data(ssum),?):(* ,?)) where the parameter r ∈ [0, 1] controls the ratio of size-determined selections, e.g., a size/age ratio of 2 : 3 corresponds to r = 0.6.
The "best-first" selection approach applied in Slothrop

In mkbTT, the strategies s max and s sum turned out to be beneficial. These strategies first restrict attention to processes where the number of symbols in E[N, p] and C[N, p] is minimal, then select nodes with minimal data and finally go for a node which has the greatest number of processes in its equation label:

s sum = (el(min(e(sum(ssum))+c(sum(ssum)))), (data(ssum),(-el(#),?)))

The strategy s max differs from s sum only in that ssum is replaced by smax. To use mkbTT with other heuristics than those just described, a user-defined strategy can be specified via a command line option.

Command Line Interface
The tool mkbTT is equipped with a simple command line interface. It expects an input problem in TPTP3 [27] or TPDB format, where in the latter case both the old textual and the newer XML format3 are supported.
A number of options can be used to configure the tool. Both a global and a local timeout in seconds can be specified using -t and -T. The termination prover is given as argument to the -tp option. Alternatively, if TTT2 is used internally a termination strategy can be supplied with the -s option. A selection strategy can be given with the option -ss, using the grammar described in Section 6.4.
The critical pair criteria BCP, PCP, CCP and MCP can be applied by supplying -cp with arguments blocked, prime, connected, and all, respectively. Isomorphism checks are to be specified via the option -is with optional arguments rename, rename+, perm, or perm+. With the suffix + process pairs are compared in every iteration, otherwise checks are only performed when a process is split. By default mkbTT applies a heuristic to determine which isomorphism is potentially applicable.
Term indexing techniques used for rewriting and unification may be selected with the options -ix and -ui together with one of nv, pi, dt, or (in the case of rewriting) ct, referring to naive search, path indexing, discrimination trees, and code trees, respectively.
The option -kp expects a floating point value larger than 1 and allows to give a process filtering rate.For example, mkbtt -kp 1.2 deletes all processes that exceed the cost of the best process by 20%.
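The arithmetic behind the filtering rate can be made concrete with a small sketch. The cost values and process names are invented for illustration, and how mkbTT actually measures the cost of a process is not specified here.

```python
def filter_processes(costs, rate):
    """Keep the processes whose cost is at most rate times the best cost.

    costs maps a process name to a non-negative cost; rate must exceed 1,
    mirroring the value passed to -kp.
    """
    if rate <= 1:
        raise ValueError("rate must be larger than 1")
    if not costs:
        return {}
    best = min(costs.values())
    return {p: c for p, c in costs.items() if c <= rate * best}

# Invented costs: with rate 1.2 the threshold is 1.2 * 100 = 120.
costs = {"p0": 100, "p1": 115, "p2": 121}
print(sorted(filter_processes(costs, 1.2)))  # ['p0', 'p1']
```

Here p2 exceeds the best cost by 21% and is dropped, while p1 (15% above the best) survives.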
To control output, the flags -ct, -st and -p require mkbTT to print the completed system, statistics and a proof in case of success. Furthermore, the tool offers a checking mode where a file containing a rewrite system supplied via the option -ch is tested for termination, confluence and for allowing rewrite proofs for the input equalities.
As an example, the call

mkbtt -t 600 -T 5 -tp aprove -cp prime WSW06_CGE2.trs

runs the tool on CGE 2 for at most 600 s using a script calling AProVE [8] for termination checks with a timeout of 5 s, and employs the critical pair criterion PCP.

Fig. 9 Web interface of mkbTT

Web Interface
Besides a command line interface, mkbTT can also be executed via a web interface. The screenshot in Fig. 9 provides an impression. Various options may be configured by the user. A global timeout and a timeout for each termination check can be specified. Termination may be either checked inside mkbTT using TTT2 functions or by applying an external tool. If TTT2 is used internally, different predefined termination strategies can be selected. This includes basic reduction orders as well as predefined strategies combining dependency pairs, a dependency graph approximation, the subterm criterion, and reduction pairs, which proved to be powerful and fast. In addition, a user-defined strategy may be supplied in the strategy language of TTT2. Alternatively, termination checks can be performed externally with AProVE or MuTerm [1]. For the retrieval of encompassments and variants, one of the implemented term indexing techniques can be selected (path indexing, discrimination trees, code trees or naive search in the node set). To restrict the deduced equational consequences, one of the implemented critical pair criteria PCP, BCP, CCP or MCP can be selected. In case of CCP, the weak connectivity test as described in Section 2 is performed. Users can control isomorphism checks by selecting symbol renamings or term permutations, and specify whether these checks are performed repeatedly or only when processes are split.

Experimental Results
We ran experiments on a server equipped with eight dual-core AMD Opteron® 885 processors running at a clock rate of 2.6 GHz with 64 GB of main memory. Our test set comprises 101 problems collected from various papers. In the following paragraphs we summarize results obtained for the whole test set, and illustrate our conclusions with selected examples from that database. For this purpose, systems with prefix TPTP refer to theories underlying unit equality problems in TPTP 3.6.0 [27], prefix SK90 refers to [25, Section 3], WSW06 refers to the Slothrop [32] distribution, and BGK94 and C89 indicate systems stemming from [5] and [6], respectively. The whole testbench as well as the full experimental data can be obtained from the website. All experiments described in the following tables featured a timeout of 600 s. If a successful completion could not be achieved within that period this is marked by ∞, whereas ⊥ indicates failure. Unless stated otherwise, all of the following experiments used these default settings of mkbTT: We interface TTT2 internally with termination strategy dp-lpo-kbo and a termination timeout of 2 s, apply selection strategy s max , and use the critical pair criterion MCP. We use only renaming isomorphisms, controlled by the auto heuristic. As term indexing techniques, code trees and discrimination trees are used to retrieve encompassments and unifiable terms, respectively.

Table 1 compares mkbTT with Slothrop, listing the time required for a successful completion in seconds. The last two lines give the number of successes and the average time required to compute a convergent system, respectively. Overall, mkbTT solves about 15% more problems than Slothrop, and achieves completion on average about three times faster. Nevertheless, due to different selection strategies which are favourable for different problems there are also examples where Slothrop can produce a convergent system but our tool (with its default strategy) cannot, such as the system BGK94-M 8 .

Termination Table 2 shows results obtained with different termination strategies when interfacing TTT2 internally. The strategies dp-kbo, dp-lpo, and dp-lpo-kbo combine dependency pairs, a dependency graph approximation, the subterm criterion and some simple counting techniques with reduction pair processors using KBO, LPO, and both, respectively. The first three strategies apply plain LPO, KBO (with weights of two bits) and linear polynomial interpretations (with coefficients of two bits). Columns (1) show the time required for completion and columns (2) the percentage of time spent on termination. The use of plain reduction orders such as LPO or KBO often results in comparatively fast completions (as e.g. in the cases of SK90-3.04 and SK90-3.27 for LPO and TPTP-GRP493-1 for KBO) because little time is spent on termination checks, as can be seen from the bottom line. On the other hand, plain reduction orders have comparatively limited power when it comes to orienting equations, which can prevent success as in case of WS06-proofreduct or the CGE systems. Overall, this results in fewer completions than obtained with more complex strategies that offer a higher flexibility. We could not find a system where polynomial interpretations are beneficial, and the overall success rate is considerably worse. The overall success rate turned out to be best with dp-lpo-kbo, supposedly since a combined strategy can cope best with problems where LPO is beneficial and problems where KBO is preferable. There are systems that can be completed with a plain reduction order but not with a strictly more powerful termination strategy employing dependency pairs, like TPTP-GRP496-1 using KBO. This illustrates that more termination power does not necessarily result in a higher chance for success because many possibilities for orienting equations also induce many processes, which can deteriorate performance significantly.

Critical Pair Criteria Table 4 compares results obtained with mkbTT using the primality criterion PCP, the connectedness criterion CCP and the mixed criterion MCP. Columns (1) list the time required for completion, columns (2) the number of redundant critical pairs for the successful process and columns (3) the total number of created nodes.

Selection Strategies
We found a single system, TPTP-GRP490-1, which could only be completed when using PCP, CCP or MCP. For a number of systems the use of critical pair criteria results in a considerably smaller number of nodes and consequently some speedup, as in the cases of C89-A 3 , TPTP-GRP457-1 or TPTP-GRP496-1. This is also reflected in the reduced average time for completion with CCP and MCP. However, there are also examples such as BGK94-D 12 which can no longer be completed when using PCP (although CCP and MCP work), and examples such as TPTP-GRP484-1 where a less powerful criterion results in fewer control loop iterations and thus a shorter completion time. In these cases the selection strategy s max seems to be influenced by the critical pair criterion in an unfortunate way: the effect of critical pair criteria for a certain system was generally found to depend on the selection strategy. When comparing the three criteria, it turns out that PCP detects the fewest critical pairs, but performs redundancy checks very fast (see the bottom line). When summing up all critical pairs filtered out for successful processes, CCP is twice as effective as PCP.
The criterion BCP is a little less effective than PCP; the relevant results can be obtained from the website. Overall MCP turned out to be most beneficial.
Term Indexing Table 5 compares the term indexing techniques implemented in mkbTT to retrieve variants and encompassments. Here nv abbreviates naive filtering of the node database, pi refers to path indexing, dt refers to discrimination trees and ct to code trees. Columns (1) list the time required for completion while columns (2) and (3) give the percentage of time spent on retrieval and rewrite operations, respectively. While all indexing techniques make it possible to complete the same number of systems, the time consumed by retrieval operations can be reduced significantly when using discrimination trees or code trees.

still have to be checked for subsuming the query term. This is not required when using a technique achieving perfect filtering such as code trees.
The bottom lines sum up retrieval times over the whole database and show that although there is little difference concerning variant retrieval (which is negligible anyway) the time for encompassment retrieval can be reduced by more than 90% using discrimination trees or code trees. As expected, maintenance operations such as insertion and removal consume hardly any time.
Concerning the retrieval of unifiable terms in deduce operations, the use of term indexing techniques turned out to be less influential. Compared to naive filtering, discrimination trees decrease the average share of time spent on retrieval from 1 to 0.3%.
Isomorphisms Isomorphism checks can be performed either only on process splits, or repeatedly for every process pair.Table 6 compares both possibilities for renamings (ren) and argument permutations (ap) with the setting where no isomorphism checks are used (a + indicates repeated checks).The setting auto refers to mkbTT's default strategy, which determines at the beginning of a completion run whether the initial equations allow for a nontrivial renaming or argument permutation.In this case repeated checks are performed throughout the deduction.Columns (1) give the time required for completion in seconds, and columns (2) the number of processes emerging in the course of a run.
Renaming checks, especially when performed repeatedly, turned out to be useful for a number of problems, in particular the CGE systems. Also for some string systems like SK90-3.28 and SK90-3.29, and for TPTP-GRP011-4, the number of processes could be reduced, although this does not always result in faster completions due to the time required for checking. Argument permutations were useful for just two small systems in the benchmark set, one of which is SK90-3.02, where the number of processes could be halved.
On the other hand, repeated checks for isomorphisms in particular can be costly if no isomorphic process pairs appear. This is for example the case for SK90-3.28 and the CGE systems when used with argument permutations. Overall the auto setting prevails concerning number of successes and average time, although the heuristic does not always make the best choice.

mkbTT completes the systems CGE$_3$, CGE$_4$ and CGE$_5$, describing the theory of 3, 4 and 5 commuting group endomorphisms, within 18, 145 and 35,796 s, respectively. Our tool also produced the first convergent TRS for the proof reduction system presented in [31].

Conclusion
The present paper reports on multi-completion with termination tools, a completion approach that combines the use of automatic termination provers with completion using multiple reduction orders. The resulting method offers a novel degree of automation as users do not have to supply a suitable order, and provides increased flexibility concerning the orientation of rules. We described the inference system, illustrated the approach by means of an example run, and formally proved its correctness. We also commented on new insights into the method's completeness. Critical pair criteria, a classical means to filter equational consequences in standard completion procedures, were carried over to the present setting. As a further improvement, isomorphisms were described and shown not to compromise completeness. We gave a detailed account of the implementation of our approach in the tool mkbTT, reporting on its control flow and implementation issues such as term indexing techniques and selection strategies. An outline of both the web and command line interface provides insight into the tool's usage. We concluded with detailed experimental results which demonstrate the power of multi-completion in comparison with Slothrop and show how the improvements allowed mkbTT to achieve new completions of challenging problems such as systems of the CGE family.

Very recently, the tool Maxcomp took a novel approach to completion by encoding the whole process as a satisfiability problem [11]. Although currently restricted to reduction orders like LPO, KBO and MPO where orientability can be encoded as a satisfiability problem, the resulting tool is fully automatic in that users do not need to supply a concrete order. Moreover, Maxcomp circumvents the choice of a concrete selection strategy, which is a critical parameter for both mkbTT and Slothrop. It can thus also never fail due to premature orientation, in contrast to mkbTT as illustrated by Example 1. This is reflected in the results presented in Table 7, where Maxcomp solves more problems than our tool when restricting the termination strategy to LPO. Even with its more complex standard termination strategy, mkbTT can complete only a few more problems, although Maxcomp of course fails on challenging problems like WSW06-CGE$_2$ and WS06-proofreduct which cannot be completed using plain LPO or KBO. However, the difference grows when a benchmark set requiring more sophisticated termination techniques is considered, as shown in Table 8. Here the 3061 problems in TPDB 7.0 were considered which are not already confluent and could be completed by at least one of the tools within 600 s.
The approach underlying mkbTT has been extended to ordered completion [34], although in this setting a ground-confluent system is only obtained when restricting to termination techniques that entail total termination. Completion modulo theories also benefits from a multi-completion approach. Termination tools that support termination analysis modulo theories can be used here. For the important AC case, this approach is implemented in the tool mascott [35]. It is to be investigated whether termination tools can be used in other variants of completion, such as normalized completion [17]. The extension to other calculi of automated reasoning which classically depend on reduction orders, for example paramodulation [20], is subject to future work, too.

Fig. 5 Part of a CGE$_2$ process tree with all branching points leading to process 011110

Lemma 4
1. Consider an MKBtt$_c$ run $\gamma\colon N_0 \vdash N_1 \vdash N_2 \vdash \cdots$ which employs a termination tool $T$ covering some reduction order $\succ$. For every node set $N_k$ there exists a process $p_k$ such that $C[N_k, p_k] \subseteq {\succ}$ and the sequence

$(\ast)$ is a valid KBtt inference step and $C[N_k, p_k] \subseteq {\succ}$. We start by setting $p_0 = \epsilon$. Now consider an inference step $N_k \vdash N_{k+1}$ with split tuple $(S_0, S_1)$. If $p_k \notin S_0 \cup S_1$ then we take $p_{k+1} = p_k$. By a straightforward adaptation of Lemma 1 to MKBtt$_c$ a corresponding KBtt (or empty) step $(\ast)$ is possible, and $C[N_{k+1}, p_{k+1}] \subseteq {\succ}$ follows from the induction hypothesis. Otherwise, we must have $p_k \in E$ for an inference step orienting a term pair $s : t$ (adopting the notation used in Fig. 6). If $s \succ t$ then $T$ proves termination of $C[N_k, p_k] \cup \{s \to t\}$ as $T$ covers $\succ$. In this case we set $p_{k+1} = p_k0$. Due to the side condition of orient, $p_k \in S_0$ and hence $p_{k+1} \in P(N_{k+1})$. Again $(\ast)$ is a KBtt step and by the choice of $p_{k+1}$ also $C[N_{k+1}, p_{k+1}] \subseteq {\succ}$ holds. The argument for the case $t \succ s$ is symmetric. If $s$ and $t$ are incomparable in $\succ$, we may choose $p_{k+1} = p_k{-}$. Then $(\ast)$ is an equality step and $C[N_{k+1}, p_{k+1}] \subseteq {\succ}$ follows from the induction hypothesis. As the constructed sequence

MKBtt$_c$ run might still fail if the wrong strategy is chosen. For example, a run on the equations $E$

$a \approx b \qquad a \approx c \qquad f(b) \approx b \qquad f(a) \approx d$

where $T$ covers the lexicographic path order (LPO) with a precedence that satisfies $a > b$, $a > c$, $b > d$ and $c > d$ may succeed with the convergent rewrite system $R$

$a \to d \qquad b \to d \qquad c \to d \qquad f(d) \to d$
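As a sanity check on the example above, the following sketch normalizes both sides of each equation in $E$ with the rules of $R$ and confirms that they reach the same normal form. It only handles ground rules, which suffices here because the example contains no variables. The tuple encoding of terms and the function name `normalize` are assumptions of this sketch, and joinability of the input equations is only one ingredient of the claim that $R$ is a convergent presentation of $E$.

```python
def normalize(term, rules):
    """Innermost normalization with ground rules given as a dict from lhs to rhs.

    Terms are tuples (f, arg1, ..., argn); constants are 1-tuples such as ("a",).
    Termination of the rule set is assumed, not checked.
    """
    term = (term[0],) + tuple(normalize(arg, rules) for arg in term[1:])
    if term in rules:
        return normalize(rules[term], rules)
    return term


a, b, c, d = ("a",), ("b",), ("c",), ("d",)


def f(x):
    return ("f", x)


# Input equations E and resulting rewrite system R from the example.
E = [(a, b), (a, c), (f(b), b), (f(a), d)]
R = {a: d, b: d, c: d, f(d): d}

# Every equation of E is joinable with respect to R.
joinable = all(normalize(s, R) == normalize(t, R) for s, t in E)
```

Here every side of every equation normalizes to the constant d, so `joinable` evaluates to `True`.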

Definition 10
Given a KB critical pair criterion CPC, the corresponding MKBtt critical pair criterion $\mathrm{CPC}_m$ maps an overlap $o$ with associated critical pair $s \approx t$, a set of processes $E$ and a node set $N$ to a process set $\mathrm{CPC}_m(o, E, N) = E'$ such that $E' \subseteq E$ and $s \approx t \in E[N, p] \setminus \mathrm{CPC}(E[N, p])$ for all $p \in E'$.
A proof order is a well-founded order $\succ$ on equational proofs such that
1. $P \succ Q$ implies $u[P\sigma]_p \succ u[Q\sigma]_p$ for all terms $u$, positions $p \in \mathrm{Pos}(u)$ and substitutions $\sigma$,
2. if $P$ and $P'$ prove the same equation then $P \succ P'$ implies $Q[P] \succ Q[P']$ for all proofs $Q$.
current constraint system $C[N, p]$ terminates when extended with $s \to t$ or $t \to s$. If $C[N, p] \cup \{s \to t\}$ terminates then $p$ is added to the set $E_{lr}$, and if $C[N, p] \cup \{t \to s\}$ terminates then $p$ is added to the set $E_{rl}$. The set $E_{lr} \setminus E_{rl}$ ($E_{rl} \setminus E_{lr}$) thus collects processes which can only perform the orientation $s \to t$ ($t \to s$). These processes are added to $R_0$ and $C_0$ ($R_1$ and $C_1$).

the rules compose, simplify and collapse create a term pair $s : u$. If $t$ and $\ell$ are variants, the MKBtt rule rewrite$_1$ allows combining the respective compose and simplify steps. If $t \mathrel{\dot{\triangleright}} \ell$ holds, rewrite$_2$ simulates all three rules at once.
- To increase efficiency, the optional gc rule deletes nodes with empty labels.
- The rule subsume is optional as well; it merges pairs of nodes which have the same data up to renaming.

As usual, a sequence of MKBtt inference steps $N_0 \vdash N_1 \vdash N_2 \vdash \cdots$ is referred to as a run. Given a set of equations $E$, the initial node set $N_0 = N_E$ consists of all nodes $\langle s : t, \emptyset, \emptyset, \{\epsilon\}, \emptyset, \emptyset \rangle$ such that $s \approx t$ is in $E$.
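The label bookkeeping described for orient and for the initial node set can be sketched on a single node. This is an illustration under stated assumptions, not the tool's code: processes are encoded as bit strings with the empty string standing for $\epsilon$, terms are opaque strings, and `terminates` is a hypothetical callback standing in for the termination tool's check of $C[N, p]$ extended by one rule. Processes that can orient in both directions are split into children $p0$ and $p1$. A full implementation would also have to replace each split process by its children in the labels of all other nodes, which is omitted here.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """A node <s : t, R0, R1, E, C0, C1> with process labels stored as sets of bit strings."""
    s: str
    t: str
    r0: set = field(default_factory=set)
    r1: set = field(default_factory=set)
    e: set = field(default_factory=set)
    c0: set = field(default_factory=set)
    c1: set = field(default_factory=set)


def initial_nodes(equations):
    """Initial node set: every equation is labelled with the root process only."""
    return [Node(s, t, e={""}) for s, t in equations]


def orient(node, terminates):
    """Orient one node; terminates(p, l, r) is a hypothetical termination check
    for process p's constraint system extended with the rule l -> r."""
    e_lr = {p for p in node.e if terminates(p, node.s, node.t)}
    e_rl = {p for p in node.e if terminates(p, node.t, node.s)}
    split = e_lr & e_rl  # processes that may orient either way
    to_r0 = (e_lr - e_rl) | {p + "0" for p in split}
    to_r1 = (e_rl - e_lr) | {p + "1" for p in split}
    oriented = Node(
        node.s,
        node.t,
        r0=node.r0 | to_r0,
        r1=node.r1 | to_r1,
        e=node.e - e_lr - e_rl,
        c0=node.c0 | to_r0,
        c1=node.c1 | to_r1,
    )
    return oriented, split
```

Processes for which neither orientation terminates stay in the equation label, so the corresponding equation remains unoriented for them.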
38) is thus a valid orient step in KBtt if $p'$ happens to be in $E$. Since $p$ occurs in $R_{lr}$, either $p \in E_{lr} \setminus E_{rl}$ or $p = p'0$ for some $p' \in S$. If $p \in E_{lr} \setminus E_{rl}$ then $p' \in E$ follows from $p' = \mathrm{pred}_S(p) = p$ and $E_{lr} \subseteq E$. Otherwise $p' = \mathrm{pred}_S(p)$ entails $p = p'0$ such that $p' \in S$, and because of $S \subseteq E$ also $p' \in E$ holds. Hence one has $E[n, p'] = \{s \approx t\}$ and, because of the node condition, $R[n, p'] = C[n, p'] = \emptyset$. Hence the KBtt inference step

If $t$ is a variant of $\ell$ we can therefore use rewrite$_1$ and otherwise rewrite$_2$ to infer

determined by whether rewrite$_1$ or rewrite$_2$ is applied, respectively. Note that $(\ast)$ is satisfied.
- If simplify reduces an equation $s \approx t$ to $s \approx u$ using a rule $\ell \to r$, there are nodes $n = \langle s : t, R_0, R_1, E, C_0, C_1 \rangle$ and $\langle \ell : r, R, \ldots \rangle$ in $N$ such that $p \in E \cap R$.

Table 2
Different termination strategies

Table 3 demonstrates the crucial impact of selection strategies in mkbTT. Columns (1) give the time required for completion and columns (2) the number of control loop iterations (i.e., selected nodes). In line with previous observations from the theorem proving literature, we found that the selection strategy is critical for the success of a run. While for some systems such as SK90-3.07, TPTP-GRP490-1 or the CGE examples it is beneficial to use $s_{max}$, there are also systems like BGK94-M 8 which can only be solved using $s_{sum}$, and problems like SK90-3.22 for which a mere size/age ratio works best. It thus seems impossible to determine a single best strategy. Since overall $s_{max}$ could complete most systems and is fastest on average, it is used by default in a specialized and faster implementation.

Table 3
Different selection strategies

Table 5 singles out some examples where the gain is especially significant. When comparing the time required for rewrite steps, discrimination trees fall behind code trees since the retrieved candidate nodes

Table 5
Different term indexing techniques

Table 6
Different isomorphisms

Novel Completions
While Slothrop was the first completion tool to handle CGE$_2$ (in more than 200 s), mkbTT can also complete the systems CGE$_3$, CGE$_4$ and CGE$_5$

Table 8
Comparison of mkbTT and Maxcomp on a subset of TPDB