Blocking and Other Enhancements for Bottom-Up Model Generation Methods

Model generation is a problem complementary to theorem proving and is important for fault analysis and debugging of formal specifications of security protocols, programs and terminological definitions. This paper discusses several ways of enhancing the paradigm of bottom-up model generation. The two main contributions are new, generalized blocking techniques and a new range-restriction transformation. The blocking techniques are based on simple transformations of the input set together with standard equality reasoning and redundancy elimination techniques. These provide general methods for finding small, finite models. The range-restriction transformation refines existing transformations to range-restricted clauses by carefully limiting the creation of domain terms. All possible combinations of the introduced techniques and classical range-restriction were tested on the clausal problems of the TPTP Version 6.0.0 with an implementation based on the SPASS theorem prover using a hyperresolution-like refinement. Unrestricted domain blocking gave best results for satisfiable problems showing it is a powerful technique indispensable for bottom-up model generation methods. Both in combination with the new range-restricting transformation, and the classical range-restricting transformation, good results have been obtained. Limiting the creation of terms during the inference process by using the new range restricting transformation has paid off, especially when using it together with a shifting transformation. The experimental results also show that classical range restriction with unrestricted blocking provides a useful complementary method. Overall, the results showed bottom-up model generation methods were good for disproving theorems and generating models for satisfiable problems, but less efficient than SPASS in auto mode for unsatisfiable problems.


Introduction
The bottom-up model generation (BUMG) paradigm encompasses a wide family of calculi and proof procedures that explicitly try to construct a model of a given clause set by reading clauses as rules and applying them in a bottom-up way until completion. For instance, variants of hyperresolution and grounding tableau calculi belong to this family. BUMG methods have long been known to be useful for proving theorems. Comparatively little effort has, however, been undertaken to exploit them for the dual task, namely, computing models for satisfiable problems. This is somewhat surprising, as computing models is recognized as being important in software engineering, model checking, and other applications for fault analysis and debugging of logical specifications.
One of the contributions of the paper is the introduction to first-order logic of blocking techniques partially inspired by techniques already successfully used in description and modal logic tableau-based theorem proving [Schmidt and Tishkovsky, 2007, Baader and Sattler, 2001]. We adapt and generalize these blocking techniques to full first-order logic. Blocking is an important technique for turning tableau systems into decision procedures for modal and description logics. Though different blocking techniques exist, and not all modal and description logic tableau systems are designed to return models, blocking is essentially a mechanism for systematically merging terms in order to find finite models.
In our approach blocking is encoded on the clausal level and is combined with standard resolution techniques, the idea being that with a suitable prover small, finite models are constructed and can be easily read off from the derived clauses. Our blocking techniques are generic and pose no restrictions on the logic they can be used for. They can even be used for undecidable logics. We introduce four different blocking techniques. The main idea of our blocking techniques is that clauses are added to the input problem which, in the derivation, lead to splittable clauses causing terms in the partially constructed models to be merged. The difference between the four techniques is how restrictive blocking is. With unrestricted domain blocking, domain-minimal models can be generated. With subterm domain blocking or subterm predicate blocking, larger models are produced because two terms are only merged if one is a subterm of the other. With unrestricted predicate blocking and subterm predicate blocking, two terms are merged if they both belong to the extension of a unary predicate symbol, the intention being that less constrained, finite models can be found.
The second contribution of the paper is a refinement of the well-known 'transformation to range-restricted form' as introduced in the eighties by Manthey and Bry [1988] in the context of the SATCHMO prover and later improved, for example, by Baumgartner et al. [1997]. These range-restricting transformations have the disadvantage that they generally force BUMG methods to enumerate the entire Herbrand universe and are therefore non-terminating except in the simplest cases. One solution is to combine classical range-restriction transformations with blocking techniques. Another solution, presented in this paper, is to modify the range-restricting transformation so that new terms are created only when needed. Our method extends and combines the range-restricting transformation introduced in Schmidt and Hustadt [2005] for reducing first-order formulae and clauses into range-restricted clauses, which was used to develop general-purpose resolution decision procedures for the Bernays-Schönfinkel class.
Other methods for model computation can be classified as methods that directly search for a finite model, such as the extended PUHR tableau method of Bry and Torge [1998], the method of Bezem [2005] and the methods in the SEM-family [Slaney, 1992, Zhang, 1995, McCune, 2003]. In contrast, MACE-style model builders such as, for example, the methods of Claessen and Sörensson [2003] and McCune [1994] reduce model search to testing of propositional satisfiability. Being based on a translation, the MACE-style approach is conceptually related to, but different from, our approach. Both SEM- and MACE-style methods search for finite models by essentially searching the space of interpretations with domain sizes 1, 2, . . ., in increasing order, until a model is found.
Our method operates significantly differently, as it is not parameterized by a domain size. Consequently, there is no requirement for iterative deepening over the domain size, and the search for finite models works differently. This way, we can address a problem often found with models computed by these methods: from a pragmatic perspective, they tend to identify too many terms. For instance, for the two unit clauses P(a) and Q(b) there is a model that identifies a and b with the same object. Such models can be counter-intuitive, for instance, in a description logic setting, where unique names are often assumed, but not necessarily explicitly specified. Furthermore, logic programs are typically understood with respect to Herbrand semantics, and it is desirable to develop compatible model building techniques. We present transformations that are more careful at identifying objects than the methods mentioned and thus work closer to a Herbrand semantics.
The structure of the paper is as follows. Definitions of basic terminology and notation can be found in Section 2. In Section 3 we recall the characteristic properties of BUMG methods. The main part of the paper consists of Sections 4 to 9. Sections 4, 5 and 6 define new techniques for generating small models and generating them more efficiently. The techniques are based on a series of transformations including a refined range-restricting transformation (Section 4), instances of standard renaming and flattening (Section 5), and the introduction of blocking in various forms through amendments of the clause set and standard saturation-based equality reasoning (Section 6). Soundness and completeness of the blocking transformations and the combined transformations are shown in Section 7. One consequence of the results is a general decidability result of the Bernays-Schönfinkel class for all BUMG methods and related approaches. This is presented in Section 8. In Section 9 we present and discuss results of experiments carried out with our methods on clausal problems in the TPTP library.
This paper is an extended and improved version of Baumgartner and Schmidt [2006].

Basic Definitions
We use standard terminology from automated reasoning. We assume as given a signature Σ = Σ f ∪ Σ P of function symbols Σ f (including constants) and predicate symbols Σ P . As we are working (also) with equality, we assume Σ P contains a distinguished binary predicate symbol ≈, which is used in infix form. Terms, atoms, literals and formulas over Σ and a given (denumerable) set of variables V are defined as usual.
A clause is a (finite) implicitly universally quantified disjunction of literals. We write clauses in a logic-programming style, that is, we write H_1 ∨ · · · ∨ H_m ← B_1 ∧ · · · ∧ B_k rather than H_1 ∨ · · · ∨ H_m ∨ ¬B_1 ∨ · · · ∨ ¬B_k, where m, k ≥ 0. Each H_i is called a head atom, and each B_j is called a body atom. When writing expressions such as H ∨ ℋ ← B ∧ ℬ we mean any clause whose head literals are H and those in the disjunction of literals ℋ, and whose body literals are B and those in the conjunction of literals ℬ. A clause set is a finite set of clauses.
A clause H ← B is said to be range-restricted iff the body B contains all the variables of the clause. This means that a positive clause H ← ⊤ is range-restricted only if it is a ground clause. A clause set is range-restricted iff it contains only range-restricted clauses.
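The definition can be checked mechanically. The following Python sketch is only an illustration: the tuple encoding of terms and atoms and the names `variables` and `is_range_restricted` are assumptions made for this example and are not taken from any prover.

```python
# Assumed encoding (hypothetical, for illustration only):
#   a variable is a str, e.g. "x";
#   a constant or functional term is a tuple (symbol, arg1, ..., argn),
#   e.g. ("a",) for the constant a and ("f", "x") for f(x);
#   an atom has the same tuple shape, e.g. ("p", "x", ("a",)) for p(x, a).

def variables(term):
    """Return the set of variables occurring in a term or atom."""
    if isinstance(term, str):
        return {term}
    return set().union(*(variables(arg) for arg in term[1:]))


def is_range_restricted(head, body):
    """A clause head <- body is range-restricted iff every variable
    of the clause occurs in the body."""
    head_vars = set().union(*(variables(atom) for atom in head))
    body_vars = set().union(*(variables(atom) for atom in body))
    return head_vars <= body_vars
```

For instance, q(x) ← p(x, y) passes the check, whereas the positive clause p(x) ← ⊤ fails it because it is not ground.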
For a given atom P(t_1, . . . , t_n) the terms t_1, . . . , t_n are also called the top-level terms of P(t_1, . . . , t_n) (P being ≈ is permitted). This notion generalizes to clause bodies, clause heads and clauses as expected. For example, for a clause H ← B the top-level terms of its body B are exactly the top-level terms of its body atoms.
A proper functional term is a term that is neither a variable nor a constant. A (Herbrand) interpretation I is a set of ground atoms, namely, those that are true in the interpretation. Satisfiability/validity in a Herbrand interpretation of ground literals, clauses, and clause sets is defined as usual. Also, as usual, a clause set stands semantically for the set of all its ground instances. We write I |= F to denote that I satisfies F , where F is a ground literal or a (possibly non-ground) clause (set).
An E-interpretation is an interpretation that is also a congruence relation on the terms in the signature. If I is an interpretation, we denote by I^E the smallest congruence relation on the terms that includes I, which is an E-interpretation. An E-interpretation need not be a Herbrand E-interpretation; in general it is a standard first-order interpretation I such that (I, µ) |= s ≈ t if and only if (I, µ)(s) = (I, µ)(t) (where µ is a valuation, that is, a mapping from the variables to the domain |I| of I). It is well-known that E-interpretations can be characterized by fixing the domain as the Herbrand universe and requiring that for every ground term t, t ≈ t ∈ I, and for every ground atom A (including ground equations) the following is true: whenever I |= A[s] and I |= s ≈ t, then I |= A[t].
Another characterization is to add to a given clause set M its equality axioms EAX(Σ P ∪Σ f ), that is, the axioms expressing that ≈ is a congruence relation on the terms and atoms induced by the predicate symbols Σ P and function symbols Σ f occurring in M . It is well-known that M is E-satisfiable iff M ∪EAX(Σ P ∪Σ f ) is satisfiable.
We work mostly, but not always, with Herbrand interpretations. If not, we always make this clear, and the interpretations considered then are first-order logic interpretations with domains that are (proper) subsets of the Herbrand universe of the clause set under consideration. Such interpretations are called quasi-Herbrand interpretations. When constructing such interpretations the requirement that function symbols are interpreted as total functions over their domain is not always trivially satisfied. For instance, in the presence of a constant a, a unary function symbol f , and the domain {a, f (a)}, say, one has to assign a value in the interpretation to every term. However f (f (a)), for instance, cannot be assigned to itself, as f (f (a)) is not contained in the domain.

BUMG Methods
Proof procedures based on model generation approaches establish the satisfiability of a problem by trying to build a model for the problem. In this paper we are interested in bottom-up model generation approaches (BUMG). BUMG approaches use a forward reasoning approach where implications or clauses, H ← B, are read as rules and are repeatedly used to derive (instances of) H from (instances of) B until a completion is found.
Hyperresolution consists of two inference rules, hyperresolution and factoring. The hyperresolution rule applies to a non-positive clause H ← B_1 ∧ . . . ∧ B_n (n ≠ 0) and n positive clauses H_i ∨ B′_i ← ⊤ (1 ≤ i ≤ n), and derives the positive clause (H ∨ H_1 ∨ · · · ∨ H_n)σ ← ⊤, where σ is the most general simultaneous unifier of B_i and B′_i for all i. On range-restricted clauses, when using hyperresolution, factoring amounts to the elimination of duplicate literals in positive clauses and is therefore optional when clauses are viewed as sets.
A crucial requirement for the effective use of blocking (considered later in Section 6) is support of equality reasoning (for example, ordered paramodulation, ordered rewriting or superposition [Ganzinger, 1998, Nieuwenhuis and Rubio, 2001]), in combination with simplification techniques based on orderings. We refer to Ganzinger [1998, 2001] for general notions of redundancy in saturation-based theorem proving approaches.
Our experiments show the splitting rule is useful for BUMG. For our blocking transformations, splitting on the positive part of (ground) clauses is in fact mandatory to make it effective. This type of splitting replaces the branch of a derivation containing a positive clause C ∨ D ← ⊤, say, by two copies of the branch in which the clause is replaced by C ← ⊤ and D ← ⊤, respectively, provided that C and D do not share any variables. Most BUMG procedures support this splitting technique, in particular, the provers we have used do.
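To make the interplay of forward reasoning and splitting concrete, the following Python sketch implements a naive grounding-tableau-style BUMG loop for range-restricted clauses. It is a minimal illustration under stated assumptions, not the hyperresolution refinement of SPASS or any other prover: the tuple encoding, the function names and the depth bound are hypothetical, and equality is not handled.

```python
# Assumed encoding (hypothetical): a variable is a str; a term or atom is a
# tuple (symbol, arg1, ..., argn). A clause is a pair (head, body) of lists of
# atoms, read as head <- body. All clauses are assumed to be range-restricted.

def match(pattern, ground, subst):
    """Extend subst so that pattern instantiated by it equals ground, else None."""
    if isinstance(pattern, str):  # pattern is a variable
        if pattern in subst:
            return subst if subst[pattern] == ground else None
        return {**subst, pattern: ground}
    if pattern[0] != ground[0] or len(pattern) != len(ground):
        return None
    for p, g in zip(pattern[1:], ground[1:]):
        subst = match(p, g, subst)
        if subst is None:
            return None
    return subst


def instantiate(term, subst):
    """Apply subst to a term or atom whose variables are all bound."""
    if isinstance(term, str):
        return subst[term]
    return (term[0], *(instantiate(arg, subst) for arg in term[1:]))


def body_matches(body, facts, subst):
    """Yield each substitution that maps every body atom onto some fact."""
    if not body:
        yield subst
        return
    for fact in facts:
        extended = match(body[0], fact, subst)
        if extended is not None:
            yield from body_matches(body[1:], facts, extended)


def generate_model(clauses, facts=(), depth=50):
    """Return a tuple of ground atoms satisfying all clauses, or None.

    None means that every branch closed or that the depth bound was reached.
    """
    if depth == 0:
        return None
    for head, body in clauses:
        for subst in body_matches(body, facts, {}):
            heads = [instantiate(atom, subst) for atom in head]
            if any(h in facts for h in heads):
                continue  # this clause instance is already satisfied
            for h in heads:  # split: try each ground head atom in its own branch
                model = generate_model(clauses, facts + (h,), depth - 1)
                if model is not None:
                    return model
            return None  # no head atom leads to a model (or the head is empty)
    return facts
```

On the clause set p(a) ← ⊤, q(x) ∨ r(x) ← p(x), ⊥ ← q(x), the sketch derives p(a), splits the ground head q(a) ∨ r(a), closes the q(a) branch and returns the model consisting of p(a) and r(a). Because the clauses are range-restricted, every head instance is ground, which is what makes this kind of splitting applicable.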

Range-Restricting Transformations
Existing transformations to range-restricted form follow Manthey and Bry [1988] (or are variations of it). The transformation can be defined by a procedure carrying out the following steps on a given set M of clauses.
(1) Add a constant. Let dom be a 'fresh' unary predicate symbol not in Σ_P, and let c be some constant. Extend crr(M) by the clause
dom(c) ← ⊤.
The constant c can be 'fresh' or belong to Σ_f.
(2) Range-restriction. For each clause H ← B in M, let {x_1, . . . , x_k} be the set of variables that occur in H but not in B. Replace H ← B by the clause
H ← B ∧ dom(x_1) ∧ · · · ∧ dom(x_k).
We refer to this clause as the clause corresponding to H ← B.
(3) Enumerate the Herbrand universe. For each n-ary f ∈ Σ_f, add the clauses:
dom(f(x_1, . . . , x_n)) ← dom(x_1) ∧ · · · ∧ dom(x_n).
The computed set crr(M) is the classical range-restricting transformation of M. It is not difficult to see that crr(M) is indeed range-restricted for any clause set M. The transformation is sound and complete, that is, M is satisfiable iff crr(M) is satisfiable [Manthey and Bry, 1988, Bry and Yahya, 2000]. The size of crr(M) is linear in the size of M, and crr(M) can be computed in linear time.
Perhaps the easiest way to understand the transformation is to imagine we use a BUMG method, for example, hyperresolution. The idea is to build the model(s) during the derivation. The clause added in Step (1) ensures that the domain of interpretation given by the domain predicate dom is non-empty.
Step (2) turns clauses into range-restricted clauses. This is done by shielding the variables {x 1 , . . . , x k } in the head, that do not occur negatively, with the added negative domain literals. Clauses that are already range-restricted are unaffected by this step.
Step (3) ensures that all elements of the Herbrand universe of the (original) clause set are added to the domain via hyperresolution inference steps.
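The three steps can be sketched in a few lines of Python. This is a hedged illustration only: the tuple encoding, the function name `crr`, the dictionary of function symbols with their arities, and the variable names x1, ..., xn are assumptions made for the example.

```python
# Assumed encoding (hypothetical): a variable is a str; a term or atom is a
# tuple (symbol, arg1, ..., argn); a clause is a pair (head, body) of lists.

def variables(term):
    """Return the set of variables occurring in a term or atom."""
    if isinstance(term, str):
        return {term}
    return set().union(*(variables(arg) for arg in term[1:]))


def crr(clauses, functions, c=("c",)):
    """Classical range-restricting transformation (sketch).

    clauses:   list of (head, body) pairs
    functions: dict mapping each function symbol (constants included) to its arity
    c:         the constant used in Step (1)
    """
    result = [([("dom", c)], [])]                        # Step (1): dom(c) <- T
    for head, body in clauses:                           # Step (2): shield head variables
        head_vars = set().union(*(variables(a) for a in head))
        body_vars = set().union(*(variables(a) for a in body))
        shield = [("dom", x) for x in sorted(head_vars - body_vars)]
        result.append((head, body + shield))
    for f, n in sorted(functions.items()):               # Step (3): enumerate the universe
        xs = [f"x{i}" for i in range(1, n + 1)]
        result.append(([("dom", (f, *xs))], [("dom", x) for x in xs]))
    return result
```

Applied to the single clause p(x) ∨ q(y) ← r(x) with a constant a and a unary function symbol f, the sketch returns dom(c) ← ⊤, p(x) ∨ q(y) ← r(x) ∧ dom(y), dom(a) ← ⊤ and dom(f(x1)) ← dom(x1). The last clause is the one that lets a bottom-up procedure build f(c), f(f(c)), and so on.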
As a consequence a clause set M with at least one non-nullary function symbol causes hyperresolution derivations to be unbounded for crr(M), unless M is unsatisfiable. This is a negative aspect of the classical range-restricting transformation. However, the method has been shown to be useful for (domain-)minimal model generation when combined with other techniques [Bry and Yahya, 2000, Bry and Torge, 1998]. In particular, Bry and Torge [1998] use splitting and the δ*-rule to generate domain-minimal models. In the present research we have evaluated the combination of blocking techniques (introduced later in Section 6) with the classical range-restricting transformation crr. This has shown promising empirical results as presented in Section 9.
Next, we introduce a new transformation to range-restricted form. Instead of enumerating the generally infinite Herbrand universe in a bottom-up fashion, the intuition is that it generates terms only as needed.
The transformation involves extracting the non-variable top-level terms in an atom. Let P(t_1, . . . , t_n) be an atom and suppose x_1, . . . , x_n are fresh variables. For all i ∈ {1, . . . , n} let s_i = t_i, if t_i is a variable, and s_i = x_i, otherwise. The atom P(s_1, . . . , s_n) is called the term abstraction of P(t_1, . . . , t_n). Let the abstraction substitution α be defined by
α = {x_i → t_i | 1 ≤ i ≤ n and t_i is not a variable}.
Hence, P(s_1, . . . , s_n)α = P(t_1, . . . , t_n), that is, α reverts the term abstraction.
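As a small illustration, term abstraction and the abstraction substitution can be computed as follows. The tuple encoding and the names `term_abstraction` and `substitute` are assumptions made for this sketch; in particular, it simply names the fresh variables x1, ..., xn and assumes these names do not already occur in the atom.

```python
# Assumed encoding (hypothetical): a variable is a str; a term or atom is a
# tuple (symbol, arg1, ..., argn), e.g. ("a",) for the constant a.

def term_abstraction(atom):
    """Return the term abstraction of an atom and its abstraction substitution.

    Every non-variable top-level term t_i is replaced by the variable x_i,
    which is assumed to be fresh, and alpha records x_i -> t_i.
    """
    pred, *args = atom
    abstracted, alpha = [], {}
    for i, t in enumerate(args, start=1):
        if isinstance(t, str):      # t_i is a variable: keep it
            abstracted.append(t)
        else:                       # t_i is not a variable: abstract it
            fresh = f"x{i}"
            abstracted.append(fresh)
            alpha[fresh] = t
    return (pred, *abstracted), alpha


def substitute(term, subst):
    """Apply a substitution to a term or atom."""
    if isinstance(term, str):
        return subst.get(term, term)
    return (term[0], *(substitute(arg, subst) for arg in term[1:]))
```

For the atom p(a, f(x, y), x) the sketch yields the abstraction p(x1, x2, x) together with α = {x1 → a, x2 → f(x, y)}, and applying α to the abstraction gives back the original atom.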
The new range-restricting transformation, denoted by rr, of a clause set M is the clause set obtained by carrying out the following steps (explanations and an example are given afterwards):
(1) Add a constant. Same as Step (1) in the definition of crr.
(2) Domain elements from clause bodies. For each clause H ← B in M and each atom P(t_1, . . . , t_n) from B, let P(s_1, . . . , s_n) be the term abstraction of P(t_1, . . . , t_n) and let α be the corresponding abstraction substitution. Extend rr(M) by the set
{dom(x_i)α ← P(s_1, . . . , s_n) | 1 ≤ i ≤ n and x_i → t_i ∈ α}.
(3) Range-restriction. Same as Step (2) in the definition of crr.
(4) Domain elements from Σ_P. For each n-ary P in Σ_P, extend rr(M) by the set
{dom(x_i) ← P(x_1, . . . , x_n) | 1 ≤ i ≤ n}.
(5) Domain elements from Σ_f. For each n-ary f in Σ_f, extend rr(M) by the set
{dom(x_i) ← dom(f(x_1, . . . , x_n)) | 1 ≤ i ≤ n}.
The intuition of the transformation reveals itself if we think of what happens when using hyperresolution. The idea is again to build model(s) during the derivation, but this time terms are added to the domain only as necessary.
Steps (1) and (3) are the same as Steps (1) and (2) in the definition of crr. The clauses added in Step (2) cause functional terms that occur negatively in the clauses to be inserted into the domain.
Step (4) ensures that positively occurring functional terms are added to the domain, and Step (5) ensures that the domain is closed under subterms.
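The five steps can be put together in a short Python sketch. It is an illustration under explicit assumptions: the tuple encoding, the function names, the dictionaries of predicate and function symbols with their arities, and the naming of fresh variables as x1, ..., xn (assumed not to clash with clause variables) are all hypothetical, and the special treatment of equality is omitted.

```python
# Assumed encoding (hypothetical): a variable is a str; a term or atom is a
# tuple (symbol, arg1, ..., argn); a clause is a pair (head, body) of lists.

def variables(term):
    """Return the set of variables occurring in a term or atom."""
    if isinstance(term, str):
        return {term}
    return set().union(*(variables(arg) for arg in term[1:]))


def all_vars(atoms):
    """Return the set of variables occurring in a list of atoms."""
    return set().union(*(variables(a) for a in atoms))


def term_abstraction(atom):
    """Replace each non-variable top-level term t_i by x_i; return atom and alpha."""
    pred, *args = atom
    abstracted, alpha = [], {}
    for i, t in enumerate(args, start=1):
        if isinstance(t, str):
            abstracted.append(t)
        else:
            abstracted.append(f"x{i}")
            alpha[f"x{i}"] = t
    return (pred, *abstracted), alpha


def rr(clauses, predicates, functions, c=("c",)):
    """New range-restricting transformation (sketch, equality not treated)."""
    result = [([("dom", c)], [])]                               # Step (1)
    extended = list(clauses)
    for _head, body in clauses:                                 # Step (2)
        for atom in body:
            abstracted, alpha = term_abstraction(atom)
            # dom(x_i)alpha is dom(t_i); the body is the abstracted atom
            extended += [([("dom", t)], [abstracted]) for t in alpha.values()]
    for head, body in extended:                                 # Step (3)
        unshielded = sorted(all_vars(head) - all_vars(body))
        result.append((head, body + [("dom", x) for x in unshielded]))
    for p, n in sorted(predicates.items()):                     # Step (4)
        xs = [f"x{i}" for i in range(1, n + 1)]
        result += [([("dom", x)], [(p, *xs)]) for x in xs]
    for f, n in sorted(functions.items()):                      # Step (5)
        xs = [f"x{i}" for i in range(1, n + 1)]
        result += [([("dom", x)], [("dom", (f, *xs))]) for x in xs]
    return result
```

For the single clause ⊥ ← r(x, f(x)), Step (2) of the sketch contributes dom(f(x)) ← r(x, x2), Step (4) contributes dom(x1) ← r(x1, x2) and dom(x2) ← r(x1, x2), and Step (5) contributes dom(x1) ← dom(f(x1)). No clause of the form dom(f(x)) ← dom(x) is produced, so terms enter the domain only when an r-fact has been derived.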
To illustrate the steps of the transformation consider the following clause.

For Step (2) the term abstraction of the body literal of clause (†) is p(x_1, x_2, x) and the abstraction substitution is α = {x_1 → a, x_2 → f(x, y)}. The clauses added in Step (2) are the following:
dom(a) ← p(x_1, x_2, x)
dom(f(x, y)) ← p(x_1, x_2, x) (‡)
Notice that among the four clauses we have so far the clauses (†) and (‡) are not range-restricted. They are however replaced by range-restricted clauses in Step (3), namely: Step (4) generates clauses responsible for inserting the terms that occur in the heads of clauses into the domain. That is, for each i ∈ {1, 2, 3} and each j ∈ {1, 2} these clauses are added.
For instance, when a model assigns true to the instance q(a, g(a, f(a, a))) of one of the head atoms of the clause ( † †), then dom(a) and dom(g(a, f(a, a))) are also true. It is not necessary to insert the terms of the instance of the other head atom into the domain. The reason is that it does not matter how these (extra) terms are evaluated, or whether the atom is evaluated to true or false in order to satisfy the disjunction. The clauses added in Step (4) alone are not sufficient, however. For each term in the domain all its subterms have to be in the domain, too. This is achieved with the clauses obtained in Step (5). That is, for each j ∈ {1, 2} these clauses are added.
For the purposes of model generation, it is important to note that one particular type of clause in the rr transformation should not be treated as a normal clause. For the equality predicate, Step (4) produces the clauses
dom(x) ← x ≈ y and dom(y) ← x ≈ y. (#)
Most theorem provers simplify these clauses to dom(x). As a consequence, all negative domain literals can be resolved away and all clauses containing a positive domain literal are subsumed. This means range-restriction is undone. This is what happens in SPASS. Since Step (4) clauses really only need to be added for positively occurring predicate symbols, an easy solution involves replacing any positive occurrence of the equality predicate by a predicate symbol myequal (say), which is fresh in the signature, and adding the Step (4) clauses for myequal rather than (#). In addition, the clause set needs to be extended by this definition of myequal:
x ≈ y ← myequal(x, y)
This solution has the intended effect of adding terms occurring in positive equality literals to the domain, and prevents other inferences or reductions on myequal. It is not difficult to prove that E-satisfiability is preserved in both directions. We will implicitly use this fact in the proofs below.
Proposition 1 (Completeness of range-restriction). Let M be any clause set. If rr(M ) is satisfiable then M is satisfiable.
Proof. Suppose rr(M) is satisfiable. Let I_rr be a Herbrand model of rr(M). We define a quasi-Herbrand interpretation I and show that it is a model of M. First, the domain of I is defined as the set |I| = {t | I_rr |= dom(t)}. Now, to define a total interpretation for the function symbols, we map each n-ary function symbol f in Σ_f to the function f^I : |I| × · · · × |I| → |I|, where, for all d_1, . . . , d_n ∈ |I|,
f^I(d_1, . . . , d_n) = f(d_1, . . . , d_n) if f(d_1, . . . , d_n) ∈ |I|, and f^I(d_1, . . . , d_n) = c otherwise.
Here, the constant c is the one mentioned in Step (1) of the transformation. (It is clear that |I| contains c.) Notice that due to Step (5) the domain |I| must contain for each term all its subterms. An easy consequence is that all terms in |I| are evaluated as themselves, exactly as in Herbrand interpretations. Each other (ground) term is evaluated as some other term from |I|.
, since g(c) ∈ |I| and by the definition of g I . We see that f is indeed mapped to a total function over the domain |I|, as required.
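The evaluation of ground terms in such a quasi-Herbrand interpretation can be illustrated with a short Python sketch. The tuple encoding and the name `evaluate` are assumptions made for the example, and the default value c for terms outside the domain follows the reading of the construction given above.

```python
# Assumed encoding (hypothetical): a ground term is a tuple
# (symbol, arg1, ..., argn), e.g. ("c",) for c and ("g", ("c",)) for g(c).

def evaluate(term, domain, c=("c",)):
    """Evaluate a ground term bottom-up in a quasi-Herbrand interpretation.

    Each function symbol f is interpreted as the total function that maps
    d_1, ..., d_n to f(d_1, ..., d_n) if that term is in the domain, and to
    the constant c otherwise.
    """
    symbol, *args = term
    value = (symbol, *(evaluate(arg, domain, c) for arg in args))
    return value if value in domain else c
```

With the domain {c, g(c)}, the terms c and g(c) evaluate to themselves, the constant a and the term g(g(c)) both evaluate to c, and g(a) evaluates to g(c) because its argument is first mapped to c.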
Regarding the interpretation of the predicate symbols in I, define for every n-ary predicate symbol P in Σ_P and for all d_1, . . . , d_n ∈ |I|:
P(d_1, . . . , d_n) ∈ I iff P(d_1, . . . , d_n) ∈ I_rr. (1)
That is, the interpretation of the predicate symbols in I is the same as in I_rr under the restriction of the domain to |I| ⊆ |I_rr|. It remains to show that I is a model of M. It suffices to pick a clause H ← B from M arbitrarily and show that I satisfies this clause. We do this by assuming that I does not satisfy H ← B and deriving a contradiction.
That I does not satisfy H ← B means there is a valuation µ such that (I, µ) |= B but (I, µ) ⊭ H. As usual, a valuation is a (total) mapping from the variables to the domain under consideration.
Because the domain |I| consists of (ground) terms, the valuation µ can be seen as a substitution. Thus, Bµ is a set of ground atoms, and Bµ ⊆ I may or may not hold. We show next that if (I, µ) |= B, as given, then Bµ ⊆ I. In other words, the body is satisfied in I because |I| contains all body atoms Bµ, but not for the reason that I assigns true to some body atom B with some argument term evaluated to c, and that atom being contained in I. An example for the latter case is |I| = {c}, B = P(x), I = {P(c)} and µ = {x → a}. Although we have (I, µ) |= P (x), in essence because a I = c, it does not hold that P(a) ∈ I. The relevance of this result is that it allows syntactically based reasoning further below to show that I is a model of M .
Thus, let us show I(t_i µ) = t_i µ. By the definition of the interpretation function ·^I it is enough to show t_i µ ∈ |I| (as said above, terms from |I| are evaluated to themselves). If t_i is a variable then t_i µ ∈ |I| follows from the fact that µ was chosen as a substitution into |I|. Assume now that t_i is not a variable and let P(s_1, . . . , s_n) be the term abstraction of P(t_1, . . . , t_n) and α its abstraction substitution. By transformation Step (2), rr(M) includes the clause
dom(x_i)α ← P(s_1, . . . , s_n), (2)
where x_i → t_i ∈ α. By the definition of an abstraction, for all j ∈ {1, . . . , n}, s_j is a fresh variable whenever t_j is not a variable.
Recall from above that P (I(t 1 µ), . . . , I(t n µ)) ∈ I rr . We are going to show now that with clause (2) this entails dom(t i µ). By the construction of |I| this suffices to prove t i µ ∈ |I|, as desired.
Consider the substitution
µ′ = µ ∪ {x_j → I(t_j µ) | 1 ≤ j ≤ n and t_j is not a variable}.
It agrees with µ (in particular) when t_j is a variable and otherwise maps the variable x_j to I(t_j µ).
When t_j is a variable then s_j = t_j by the definition of an abstraction. This means s_j µ′ = s_j µ = t_j µ = I(t_j µ) (the latter identity holds, again, because µ is a substitution into |I| and elements from |I| evaluate to themselves). When t_j is not a variable then s_j is the variable x_j. By construction of µ′ we have
s_j µ′ = x_j µ′ = I(t_j µ).
Applying the substitution µ′ to the clause (2) yields
dom(x_i)αµ′ ← P(s_1, . . . , s_n)µ′.
With the identities s_j µ′ = I(t_j µ), the identities dom(x_i)αµ′ = dom(t_i)µ′ and the fact that P(I(t_1 µ), . . . , I(t_n µ)) ∈ I_rr it follows that dom(t_i)µ′ ∈ I_rr. The substitutions µ and µ′ differ only on the fresh variables x_1, . . . , x_n. Therefore dom(t_i)µ′ = dom(t_i)µ and dom(t_i)µ ∈ I_rr follows, as desired. This was the last subgoal to be proven to establish P(t_1, . . . , t_n)µ ∈ I, which, in turn, remained to be shown to complete the proof that Bµ ⊆ I.
The next step in the proof is to show that the clause body of the clause in rr(M) corresponding to H ← B is satisfied by I_rr. That clause is the range-restricted version of the clause H ← B in M. According to Step (3) of the transformation it has the form
H ← B ∧ dom(x_1) ∧ · · · ∧ dom(x_k) (3)
for some variables x_1, . . . , x_k, namely those occurring in H but not in B.
From Bµ ⊆ I as derived above and equivalence (1) it follows that Bµ ⊆ I rr . Recall that µ is a valuation mapping into the domain |I|. Reading it as a substitution gives x j µ ∈ |I|, for all j ∈ {1, . . . , k}. From the construction of |I| it follows that dom(x j µ) ∈ I rr . Together with Bµ ⊆ I rr and the fact that I rr is a model of rr(M ), and hence of clause (3), it follows that I rr satisfies Hµ. This means Aµ ∈ I rr for some head atom A in H.
The atom A is of the form Q(s_1, . . . , s_m) for some m-ary predicate symbol Q and terms s_1, . . . , s_m. By Step (4) of the transformation, rr(M) includes, for all i ∈ {1, . . . , m}, the clause
dom(x_i) ← Q(x_1, . . . , x_m). (4)
Again by reading µ as a substitution, because I_rr is a model of rr(M), and hence of clause (4), and by the identities Q(s_1 µ, . . . , s_m µ) = Q(s_1, . . . , s_m)µ = Aµ ∈ I_rr we conclude dom(s_i µ) ∈ I_rr, for all i ∈ {1, . . . , m}. By construction of |I| we have that s_i µ ∈ |I|. By equivalence (1) it follows that Q(s_1 µ, . . . , s_m µ) ∈ I.
Recall that Q(s_1, . . . , s_m) is a head atom of the clause (3) and hence a head atom of the clause H ← B. Further recall that s_i µ ∈ |I| entails that s_i µ is evaluated to itself in I. Together with Q(s_1 µ, . . . , s_m µ) ∈ I this means (I, µ) |= Q(s_1, . . . , s_m). This is a contradiction to (I, µ) ⊭ H as concluded above. The proof is complete.
The proof actually gives a characterization of the models associated with a satisfiable clause set rr(M ). (2) and (3), which are the only ones that apply directly to the given clauses, have no effect on the equality ax- Adding back the reflexivity axiom trivially preserves unsatisfiability, that is, with rr(M ∪ EAX(Σ P ∪ Σ f )) being unsatisfiable, so is The clause x ≈ x ← dom(x) can be deleted because it is subsumed by the clause is unsatisfiable, and so rr(M ) is E-unsatisfiable.
We emphasize that we do not propose to actually use the equality axioms in conjunction with a theorem prover (though this is of course possible). They serve merely as a theoretical tool to prove completeness of the transformation. By carefully modifying the definition of rr and at the expense of some duplication it is possible to compute the reduction in linear time.
Proposition 2 (iii) confirms that every clause produced by the rr transformation is range-restricted. Let us consider another example to get a better understanding of the rr transformation.
r(x) ← q(x) ∧ p(f(x)) (*)
Applying Steps (2) and (3) of the rr transformation gives us the clause
dom(f(x)) ← p(x_1) ∧ dom(x).
This clause is splittable into
dom(f(x)) ← dom(x) and ⊥ ← p(x_1).
The first split component clause is an example of an 'enumerate the Herbrand universe' clause from the crr transformation (Step (3) in the definition of crr). Such clauses are unpleasant because they cause the entire Herbrand universe to be enumerated with BUMG approaches. Before describing a solution let us analyze the problem further. The main rationale of our rr transformation is to constrain the generation of domain elements and limit the number of inference steps. The general form of clauses produced by Step (2), followed by Step (3), is the following, where y ⊆ x, x ⊆ y ∪ z and u ⊆ z.
Clauses of the first form are often splittable (as in the example above), and can produce clauses of the unwanted 'enumerate the Herbrand universe' form. Suppose therefore that splitting of any clause is forbidden when this splits the negative part of the clause (neither SPASS nor a hypertableaux prover does this anyway). Although the two types of clauses above both do reduce the number of terms created, compared to the classical range-restricting transformation, the constraining effect of the first type of clauses is slightly limited. Terms f(s) are not generated only as long as no fact P(t) is present or has been derived. When a clause P(t) is present, or as soon as such a clause is derived (for any ground terms t), then terms are freely generated from terms already in the domain with f as the top symbol.
Here is an example of a clause set for which the derivation is infinite on the rr transformation. (The example is an extension of the example above with the clause p(b) ← ⊤.) Notice the derivation is infinite on the classical range-restricting transformation as well, due to the clauses generated in Step (3) of crr. The second type of clauses, dom(f(u)) ← P(z), is less problematic. Here is a concrete example. For ⊥ ← r(x, f(x)), Step (2) produces the clause dom(f(x)) ← r(x, x_2). Although this clause, and the general form, still causes larger terms to be built with hyperresolution type inferences, the constraining effect is larger.
In the next two sections we discuss ways of improving range-restricting transformations further.

Shifting Transformation
The clauses introduced in Step (2) of the new rr transformation to range-restricted form use abstraction and insert (possibly a large number of) instantiations of terms occurring in the clause bodies into the domain. These are sometimes unnecessary and can lead to non-termination of BUMG procedures.
The shifting transformation introduced next can address this problem. It consists of two sub-transformations, basic shifting and partial flattening.
If A is an atom P(t_1, . . . , t_n) then let not A denote the atom not P(t_1, . . . , t_n), where not P is a fresh predicate symbol which is uniquely associated with the predicate symbol P. If P is the equality symbol ≈ we write not P as ≉ and use infix notation. Now, the basic shifting transformation of a clause set M is the clause set bs(M) obtained from M by carrying out the following steps.
Each of the atoms B 1 , . . . , B m is called a shifted atom.
(2) Shifted atom consistency. Extend bs(M) by the clause set {⊥ ← P(x_1, . . . , x_n) ∧ not P(x_1, . . . , x_n) | P is the n-ary predicate symbol of a shifted atom}.
Notice that we do not add clauses complementary to the 'shifted atoms consistency' clauses, that is, P (x 1 , . . . , x n ) ∨ not P (x 1 , . . . , x n ) ← ⊤. They could be included but are superfluous.
Let us continue the example given at the end of the previous section. We can use basic shifting to move negative occurrences of functional terms into heads. In the example, clause ( * ) is replaced by Even in the presence of an additional clause, say, q(x) ← ⊤, which leads to the clauses termination of BUMG can be achieved. For instance, in a hyperresolution-like mode of operation and with splitting enabled, the SPASS prover [Weidenbach et al., 2007, 2009] splits the derived clause r(a) ∨ not p(f(a)), considers the case with the smaller literal r(a) first and terminates with a model. This is because a finite completion (model) is found without considering the case of the bigger literal not p(f(a)), which would have added the deeper term f(a) to the domain. The same behaviour can be achieved, for example, with the KRHyper BUMG prover, a hypertableaux theorem prover [Wernhard, 2003].
As can be seen in the example, the basic shifting transformation trades the generation of new domain elements for a smaller clause body by removing literals from it. Of course, a smaller clause body affects the search space, as then the clause can be used as a premise more often. To (partially) avoid this effect, we propose an additional transformation to be performed prior to the basic shifting transformation.
For a clause set M , the partial flattening transformation is the clause set pf(M ) obtained by applying the following steps. (1) Reflexivity. Extend pf(M ) by the unit clause x ≈ x ← ⊤.
(2) Partial flattening. For each clause H ← B in pf(M ), let t 1 , . . . , t n be all top-level terms occurring in the non-equational literals in the body B that are proper functional terms, for some n ≥ 0. Let x 1 , ..., x n be fresh variables. Replace the clause H ← B[t 1 , . . . , t n ] by the clause It should be noted that the equality symbol ≈ need not be interpreted as equality, but could. (Un-)satisfiability (and logical equivalence) is preserved even when reading it just as 'unifiability'. This can be achieved by the clause x ≈ x ← ⊤. One should however note that the reflexivity clause is not compatible with introducing the myequal predicate, so this might not always be a possibility. (In our implementation, for this reason the reflexivity clause is not added.) In our running example, applying the transformations pf, bs and rr, in this order, yields the following clauses (among other clauses, which are omitted because they are not relevant to the current discussion).
Observe that the first clause is more restricted than the clause ( * * ) above because of the additional body literal p(u).
The reason for not extracting constants during partial flattening is that adding them to the domain does not cause non-termination of BUMG methods. It is preferable to leave them in place in the body literals because they have a stronger constraining effect than the variables introduced otherwise.
Extracting top-level terms from equations has no effect at all. Consider the unit clause ⊥ ← f (a) ≈ b, and its partial flattening Applying basic shifting yields f (a) ≉ x ← x ≈ b, and hyperresolution with x ≈ x ← ⊤ gives f (a) ≉ b ← ⊤. This is the same result as obtained by the transformations as defined. This explains why top-level terms of equational literals are excluded from the definition. (One could consider using 'standard' flattening, that is, recursively extracting terms, but this does not lead to any improvements over the defined transformations.) Finally, we combine basic shifting and partial flattening to give the shifting transformation, formally defined by sh := pf • bs, that is, sh(M ) = bs(pf(M )), for any clause set M .
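The partial flattening step can be illustrated with a minimal Python sketch. It is not the Yarralumla implementation (which is written in SWI-Prolog) and rests on several assumptions made only for illustration: terms are encoded as nested tuples (a constant a is ("a",), f(x) is ("f", "x"), variables are plain strings), equality is the predicate "=", the fresh variables are named _v1, _v2, ..., every occurrence of a proper functional term receives its own fresh variable, and the extracted equations are appended to the body in the orientation t ≈ x.

```python
from itertools import count


def is_proper_functional(term):
    """A proper functional term is a function application that is not a constant."""
    return isinstance(term, tuple) and len(term) > 1


def partial_flatten(head, body):
    """Abstract top-level proper functional terms out of non-equational body atoms.

    head and body are lists of atoms (pred, arg1, ..., argn).
    Assumption: variable names of the form _v<i> do not occur in the input clause.
    """
    fresh = (f"_v{i}" for i in count(1))
    new_body, equations = [], []
    for atom in body:
        pred, *args = atom
        if pred == "=":
            # Equational literals are left untouched, as in the definition.
            new_body.append(atom)
            continue
        new_args = []
        for term in args:
            if is_proper_functional(term):
                var = next(fresh)
                equations.append(("=", term, var))  # assumed orientation t = x
                new_args.append(var)
            else:
                # Variables and constants stay in place.
                new_args.append(term)
        new_body.append((pred, *new_args))
    return head, new_body + equations


# The clause  ⊥ ← r(x, f(x))  from the running example (empty head encodes ⊥).
flat_head, flat_body = partial_flatten([], [("r", "x", ("f", "x"))])
```

In this encoding the body of ⊥ ← r(x, f(x)) becomes r(x, _v1) together with the equation f(x) ≈ _v1, while a clause such as ⊥ ← f(a) ≈ b is returned unchanged.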
Proposition 3 (Completeness of shifting). Let M be any clause set. If sh(M ) is satisfiable then M is satisfiable.
Proof. Not difficult, since bs (basic shifting) can be seen to be a structural transformation and pf (partial flattening) is a form of term abstraction.
Corollary 2 (Completeness of shifting wrt. E-interpretations). Let M be any clause set. If sh(M ) is E-satisfiable then M is E-satisfiable.
Proof. Using the same line of argument as in the proof of Corollary 1, proving preservation of E-satisfiability can be reduced to proving preservation of satisfiability by means of the equality axioms (observe that the shifting transformation does not modify the equality axioms).

Blocking
The final transformation introduced in this paper is called blocking and provides a mechanism for detecting recurrence in the derived models. The blocking transformation is designed to realize a 'loop check' for the construction of a domain, by capitalizing on available, powerful equality reasoning technology and redundancy criteria from saturation-based theorem proving. To be suitable, a resolution-based prover, for instance, should support hyperresolution-style inference, strong equality inference (for example, superposition or ordered rewriting), splitting, and the possibility to search for split-off equations first and standard redundancy elimination techniques.
The basic idea behind blocking is to add clauses that cause a case analysis of the form s ≈ t versus s ≉ t, for (ground) terms s and t. Although such a case analysis obviously leads to a bigger search space, it provides a powerful technique to detect finite models with a BUMG prover. This is because in the case that s ≈ t is assumed, this new equation may lead to rewriting of otherwise infinitely many terms into one single term. To make this possible, the prover must support the above features, including notably splitting. Among resolution theorem provers splitting has become standard. Splitting was first available in the saturation-based prover SPASS [Weidenbach et al., 2007, 2009], but is now also part of VAMPIRE [Riazanov and Voronkov, 2002] and E [Schulz, 2013]. Splitting is an integral part of the hypertableau prover E-KRHyper [Baumgartner et al., 2007, Pelzer and Wernhard, 2007].
Blocking has the same goal as the unsound theorem proving technique introduced first in Lynch [2004]. Instances of unsound theorem proving exemplified in Lynch [2004] include replacing a clause by one that subsumes it, and adding equations for joining equivalence classes in the abstract congruence closure framework. Unsound theorem proving has been incorporated later in DPLL(T)-based theorem proving [Bonacina et al., 2011].
In the following we introduce four different, but closely related, blocking transformations, called subterm domain blocking, subterm predicate blocking, unrestricted domain blocking and unrestricted predicate blocking. Subterm domain blocking was introduced in the short version of this paper under the name blocking [Baumgartner and Schmidt, 2006]. Subterm predicate blocking is inspired by and related to the blocking technique described in . Unrestricted domain blocking is the first-order version of the unrestricted blocking rule introduced in Schmidt and Tishkovsky [2007] and used for developing terminating tableau calculi for logics with the effective finite model property in Tishkovsky [2008, 2011].

Subterm Domain Blocking
By definition, the subterm domain blocking transformation of a clause set M is the clause set sdb(M ) obtained from M by carrying out the following steps. (1) Axioms describing the subterm relationship. Let sub be a 'fresh' binary predicate symbol not in Σ P . Extend sdb(M ) by and, for every n-ary function symbol f ∈ Σ f and all i ∈ {1, . . . , n}, add the clauses (2) Subterm equality case analysis. Extend sdb(M ) by these clauses. x ≈ y ∨ x ≉ y ← sub(x, y) The subterm domain blocking transformation makes it possible to contemplate whether two domain elements that are in a subterm relationship should be identified and merged, or not. This blocking transformation preserves range-restrictedness. In fact, because the dom predicate symbol is mentioned in the definition, the blocking transformation can be applied meaningfully only in combination with range-restricting transformations.
Reading sub(s, t) as 's is a subterm of t', Step (1) in the blocking transformation might seem overly involved, because an apparently simpler specification of the subterm relationship for the terms of the signature Σ f can be given. Namely: for every n-ary function symbol f ∈ Σ f and all i ∈ {1, . . . , n}. This clause set is range-restricted. Yet, this specification is not suitable for our purposes. The problem is that the second clause introduces proper functional terms. For example, for a given constant a and a unary function symbol f, when just dom(a) alone has been derived, a BUMG procedure derives an infinite sequence of clauses: sub(a, a), sub(a, f(a)), sub(a, f(f(a))), . . . .

This does not happen with the specification in Step (1). It ensures that conclusions of BUMG inferences involving sub are about terms currently in the domain, and the domain is always finite.
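The effect of restricting sub to the current domain can be sketched in a few lines of Python. This is an illustration only, not the clause-level definition of Step (1): terms are encoded as nested tuples (the constant a is ("a",), f(a) is ("f", ("a",))), the function names are hypothetical, and the domain is assumed to be closed under subterms.

```python
def subterms(term):
    """Yield the term itself and all of its subterms."""
    yield term
    for arg in term[1:]:
        yield from subterms(arg)


def sub_facts(domain):
    """All pairs (s, t) with s a subterm of t where both s and t are in the domain."""
    return {(s, t) for t in domain for s in subterms(t) if s in domain}


a = ("a",)
fa = ("f", a)

# With only dom(a) derived, the single fact sub(a, a) is obtained.
facts_a = sub_facts({a})
# Once dom(f(a)) is also present, sub(a, f(a)) and sub(f(a), f(a)) are added.
facts_a_fa = sub_facts({a, fa})
```

Because every fact mentions only terms already in the domain, the set of sub facts is finite whenever the domain is, in contrast to the sequence sub(a, a), sub(a, f(a)), sub(a, f(f(a))), . . . produced by the simpler specification.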
To justify the clauses added in Step (2) we continue this example and suppose an interpretation that contains dom(a) and dom(f(a)). These might have been derived earlier in the run of a BUMG prover. Then, from the clauses added by blocking, the (necessarily ground) disjunction Now, it is important to use a BUMG prover with support for splitting and to equip it with an appropriate search strategy. In particular, when deriving a disjunction such as the one above, the ≈-literal should be split off and the clause set obtained in this case should be searched first. The reason is that the (ground) equation f(a) ≈ a thereby obtained can then be used for simplification and redundancy testing purposes. For example, should dom(f(f(a))) be derivable now (in the current branch), then any prover based on a modern, saturation-based theory of equality reasoning is able to prove it redundant from f(a) ≈ a and dom(a). Consequently, the domain is not extended explicitly. The information that dom(f(f(a))) is in the domain is however implicit via the theory of equality.
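The redundancy of dom(f(f(a))) in the presence of f(a) ≈ a can be made concrete with a small ground rewriting sketch in Python. It only mimics the effect of rewriting with an oriented ground equation and is not how SPASS implements simplification. The tuple encoding of terms and the function name normalize are assumptions for illustration, and the rule set is assumed to be terminating with right-hand sides already in normal form.

```python
def normalize(term, rules):
    """Rewrite a ground term innermost-first with ground rules lhs -> rhs.

    rules maps ground terms to ground terms; it is assumed to be terminating
    and its right-hand sides are assumed to be irreducible.
    """
    # Normalize the arguments first (a constant has no arguments).
    term = (term[0], *(normalize(arg, rules) for arg in term[1:]))
    # Then rewrite at the root for as long as a rule applies.
    while term in rules:
        term = rules[term]
    return term


a = ("a",)
fa = ("f", a)
ffa = ("f", fa)

# The split-off equation f(a) ≈ a, oriented as the rewrite rule f(a) -> a.
rules = {fa: a}
normal_form = normalize(ffa, rules)
```

Here f(f(a)) rewrites to a, so dom(f(f(a))) collapses to the already present dom(a) and no new domain element is needed.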

Subterm Predicate Blocking
Subterm domain blocking defined in the previous section applies blocking to domain terms where one is a proper subterm of the other. The idea of the subterm (unary) predicate blocking transformation is similar, but it merges only the (sub)terms in the extension of unary predicate symbols different to dom in the current interpretation.
Subterm predicate blocking is defined as follows: (1) Axioms describing the subterm relationship. Same as Step (1) in the definition of sdb.
(2) Subterm equality case analysis. Extend spb(M ) by these clauses, for each unary predicate symbol p ∈ Σ P . (Recall that Σ P does not contain dom.) Finally, add the clause Observe that the only difference between this transformation and the subterm domain blocking transformation lies in Step (2). The clauses x ≈ y ∨ x ≉ y ← sub(x, y) ∧ p(x) ∧ p(y) added here are obviously more restrictive than their counterpart x ≈ y ∨ x ≉ y ← sub(x, y) in the definition of the subterm domain blocking transformation sdb.
That subterm predicate blocking is strictly more restrictive can be seen from the following example, which also helps to explain the rationale behind this transformation.
Notice that the subterm predicate blocking transformation includes the clauses These are however only applicable for sub(s, s), p(s) and q(s) which only lead to redundant BUMG inferences. The motivation behind these clauses is to block two p-literals (say) only when there are two literals p(s) and p(t) where s is a subterm of t. Conversely, if no such loop comes up, as in the example above, there is no reason for blocking. By contrast, the subterm domain blocking transformation sdb with its clause x ≈ y ∨ x ≉ y ← sub(x, y) would be applicable even for distinct terms, leading to the (unnecessary) split into the cases a ≈ f(a) and a ≉ f(a). From a more general perspective, the spb transformation is motivated by the application to description logic knowledge bases Schmidt, 1999, Baader and Sattler, 2001]. Often, such knowledge bases do not contain cyclic definitions, or only few definitions are cyclic. The subterm predicate transformation aims to apply blocking only to concepts (unary predicates) with cyclic definitions. Below, in Section 6.5, we discuss a description logic example to highlight the differences between the various blocking transformations.

Unrestricted Domain Blocking
The two previous 'subterm' variants of the blocking transformation allow terms and their subterms to be identified speculatively. The 'unrestricted' variants introduced next differ from both by allowing speculative identifications of any two terms.
For the 'domain' variant, called unrestricted domain blocking transformation, the definition is as follows. (1) Domain elements equality case analysis. Extend udb(M ) by these clauses.
There is a clear trade-off between this transformation and the subterm domain blocking transformation sdb. On the one hand, the unrestricted domain blocking transformation induces a larger search space, as the bodies of the clauses x ≈ y ∨ x ≉ y ← dom(x) ∧ dom(y) are less constrained than their counterparts in the subterm domain blocking transformation. This becomes obvious after extending the clause body of x ≈ y ∨ x ≉ y ← sub(x, y) from the sdb transformation with dom(x) ∧ dom(y), which does not change anything. On the other hand, the unrestricted domain blocking transformation enables the finding of models with smaller domains. This means fewer congruence classes on the Herbrand terms are induced by the equality relation ≈. As our experiments show, such models can often be found quicker in satisfiable problems, even for the crr transformation.
Using the ideas of the termination proof in  for semantic ground tableau with unrestricted domain blocking for description logics with the expressive power similar to the two-variable fragment of first-order logic, it can be shown that BUMG with unrestricted domain blocking can return finite models, if they exist, even for problems of undecidable fragments. Carrying over also the results in Schmidt and Tishkovsky [2008] implies that unrestricted domain blocking can be used in BUMG methods to return domain minimal models for logics with the effective finite model property.

Unrestricted Predicate Blocking
The definition of the last variant of blocking, the unrestricted (unary) predicate blocking transformation, is as follows. (1) Term equality case analysis. Extend upb(M ) by these clauses, for each unary predicate symbol p ∈ Σ P .
x ≈ y ∨ x ≉ y ← p(x) ∧ p(y) Finally, add the clause This transformation makes it possible to equate any two (distinct) terms in a p-relation, if there are any. The motivation is a combination of the above, to block cycles on p-literals if they arise, and to compute models with small domains.
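To see the four variants side by side, the following Python sketch prints the equality case-analysis clauses that each transformation adds, as they are stated in the text. It is a hypothetical helper for illustration, not part of Yarralumla: the function name and the string encoding of clauses are assumptions, and the subterm axioms of Step (1) as well as any further clauses of the definitions are deliberately left out.

```python
def case_analysis_clauses(variant, unary_predicates=()):
    """Return the case-analysis clauses of a blocking variant as strings.

    variant is one of "sdb", "spb", "udb", "upb"; unary_predicates lists the
    unary predicate symbols of the input clause set (dom excluded).
    """
    head = "x ≈ y ∨ x ≉ y"
    if variant == "sdb":
        bodies = ["sub(x, y)"]
    elif variant == "udb":
        bodies = ["dom(x) ∧ dom(y)"]
    elif variant == "spb":
        bodies = [f"sub(x, y) ∧ {p}(x) ∧ {p}(y)" for p in unary_predicates]
    elif variant == "upb":
        bodies = [f"{p}(x) ∧ {p}(y)" for p in unary_predicates]
    else:
        raise ValueError(f"unknown blocking variant: {variant}")
    return [f"{head} ← {body}" for body in bodies]


for variant in ("sdb", "spb", "udb", "upb"):
    for clause in case_analysis_clauses(variant, unary_predicates=("p", "q")):
        print(variant, ":", clause)
```

The heads are identical in all four cases; the variants differ only in how strongly the clause body constrains the pairs of terms that are candidates for identification.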

Comparison on an Example
It is instructive to compare the effects on the returned models of the four blocking transformations on an example from description logics. To this end, consider the description logic knowledge base (left) and its translation into clause logic (right) in Table 1. Notice that the cycle in the inclusion statements in the TBox (for p 1 and p 2 ) means some form of blocking is needed for decidability in tableau-based description logic systems. Likewise, blocking is needed to force BUMG methods to terminate on the translated clause form. Any of the four blocking transformations defined above suffices. Table 2 summarizes the behaviour of these transformations, in terms of interesting relations in the computed model.
When comparing in detail the blocking techniques developed for description logics it becomes clear that the transformations rr • τ and sh • rr • τ , for τ ∈ {sdb, spb, udb, upb}, when applied to a knowledge base with the finite model property, in conjunction with a suitable BUMG method (see above), can be refined to simulate various forms of standard blocking techniques used in description logic systems, including subset ancestor blocking and equality ancestor blocking, cf. ,  and Khodadadi et al. [2013]. Because standard loop checking mechanisms used in description logic systems do not require backtracking, appropriate search strategies and restrictions for performing inferences and applying blocking need to be used.
An advantage of our approach to blocking as opposed to blocking without equality reasoning used in mainstream description logic systems [Baader and Sattler, 2001] is that it applies to any first-order clause set, not only to clauses from the translation of description logic problems. This makes the approach very general and widely applicable.
For instance, our approach makes it possible to extend description logics with arbitrary (first-order expressible) 'rule' languages. 'Rules' provide a connection to (deductive) databases and are being used to represent information that is currently not expressible in the description logics associated with OWL DL. The specification of many natural properties of binary relations and complex statements involving binary relations are outside the scope of most current description logic systems. An example is the statement: individuals who live and work at the same location are home workers. This can be expressed as a Horn rule (clause) homeWorker(x) ← work(x, y) ∧ live(x, z) ∧ loc(y, w) ∧ loc(z, w), but, with some exceptions , Weidenbach et al., 2007, is not expressible in current description logic systems.

Soundness and Completeness of the Transformations
Each of the blocking transformations is complete: The converse, that is, soundness of the transformation, is easy to prove. One basically needs to observe that the clauses added in respectively Steps (2) and (1) of the blocking transformations, realize a case distinction over whether two terms are equal or not. Trivially, one of the two cases always holds.
Putting all the transformations and the corresponding results together we can state the main theoretical result of the paper. (ii) tr(M ) can be computed in quadratic time.
The reverse directions of (iii), that is, soundness of the respective transformations, hold as well. The proofs are either easy or completely standard.
By carefully modifying the definition of rr it is possible to compute the reductions in linear time.

Decidability of BS classes
The Bernays-Schönfinkel class can be decided using transformations into range-restricted clauses. Formulae in the Bernays-Schönfinkel class are conjunctions of function-free and equality-free formulae of the form ∃ * ∀ * ψ, where ψ is free of quantifiers. A clause is a BS clause iff all functional terms occurring in it are constants.
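The syntactic condition for BS clauses is easy to check mechanically. The following Python sketch illustrates it under an assumed encoding that is not taken from the paper: a clause is a list of atoms (pred, arg1, ..., argn), variables are strings, a constant a is the tuple ("a",), and a proper functional term such as f(x) is ("f", "x").

```python
def is_bs_clause(atoms):
    """True iff every functional term among the atoms' arguments is a constant.

    atoms lists all atoms of the clause, head and body alike.
    """
    for atom in atoms:
        for term in atom[1:]:
            # A tuple with arguments is a proper functional term, not a constant.
            if isinstance(term, tuple) and len(term) > 1:
                return False
    return True


# r(x, a) ← p(x) is a BS clause; ⊥ ← r(x, f(x)) is not.
bs_example = [("r", "x", ("a",)), ("p", "x")]
non_bs_example = [("r", "x", ("f", "x"))]
```

Only top-level arguments need to be inspected, because any nested functional term makes the enclosing top-level argument a proper functional term.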
It is proved in Schmidt and Hustadt [2005] that hyperresolution and any refinements decide the class of range-restricted BS clauses without equality. Here we assume that the language includes equality.
Theorem 2. The class of range-restricted BS clauses (with equality) is decidable by hyperresolution (and paramodulation) and all refinements.
This means all refinements of hyperresolution (and some form of equality reasoning) combined with any translation into range-restricted clauses is a decision procedure for the BS class. Therefore: (i) Hyperresolution and all refinements decide tr(M ).
(ii) All BUMG methods decide M .
Since there are linear transformations of first-order formulae into clausal form, and since all the tr transformations are effective reductions of first-order clauses into range-restricted clauses, we obtain the following result.
(i) There is a quadratic (linear), satisfiability equivalence preserving transformation of any formula in the Bernays-Schönfinkel class, and any set of BS clauses, into a set of range-restricted BS clauses.
(ii) All procedures based on hyperresolution or BUMG decide the class of BS formulae and the class of BS clauses.
In Schmidt and Hustadt [2005] a similar but different transformation is used to prove this result for hyperresolution and BS without equality. In fact, what is crucial for deciding the BS class is a grounding method. This can be achieved by any form of range-restriction and hyperresolution-like inferences. Theorem 3.(ii) can therefore be strengthened to include also any instantiation-based method, in particular also methods using on-the-fly instantiation such as semantic Smullyan-type tableaux.

Experimental Evaluation
We have implemented the transformations described in the previous sections and carried out experiments on problems from the TPTP library, Version 6.0.0.
The implementation, in SWI-Prolog, is called Yarralumla (Yet another range-restriction avoiding loops under much less assumptions). Since the transformations introduced in this paper are defined for clausal problems we have selected for the experiments all the CNF problems from the TPTP suite.
In our initial research [Baumgartner and Schmidt, 2006] we used Yarralumla with the MSPASS theorem prover, Version 2.0g.1.4 [Hustadt and Schmidt, 2000]. As the extra features of MSPASS have in the meantime been integrated into the SPASS theorem prover [Weidenbach et al., 2007] and SPASS has significantly evolved since Version 2.0, for the present paper we combined Yarralumla with SPASS Version 3.8d as a BUMG system.
For that purpose we modified the code of SPASS in a number of ways. We added one new flag to activate splitting on positive ground equality literals in positive non-Horn clauses. The main inference loop was adapted so that finding a splitting clause and applying splitting has highest priority (unchanged) followed immediately by picking a non-positive blocking clause, that is, a clause of the form s ≈ t ∨ H 1 ∨ · · · ∨ H m ← B 1 ∧ · · · ∧ B k for m ≥ 0 and k > 0, and performing inferences with it. The selection of splitting clauses was adapted so that positive blocking clauses are always selected, when there are any. Moreover, the first equality literal is split upon. Positive blocking clauses are ground clauses of the form s ≈ t ∨ H 1 ∨ · · · ∨ H m , where m ≥ 1. This adaptation ensures blocking is performed eagerly to keep the set of ground terms small. The tests with Yarralumla were performed using ordered resolution and superposition with selection of at least one negative literal, forward and backward rewriting, unlimited splitting and matching replacement resolution, subsumption deletion and various other simplification rules. This means the inferences are performed in an ordered hyperresolution-style with eager splitting and forward and backward ground rewriting. The derivations constructed are thus BUMG tree derivations, the proofs produced are BUMG refutation proofs, and the models returned are BUMG models.
We also tested SPASS Version 3.8d in auto mode on the sample. In auto mode SPASS used ordered resolution with dynamic selection. SPASS automatically turned off splitting for non-Horn clauses. Dynamic selection means typically literals were only selected if multiple maximal literals occur in a clause. This means the behaviour of SPASS in auto mode was very different to that of SPASS-Yarralumla, which always selected a literal in clauses with non-empty negative part. The changes to SPASS in SPASS-Yarralumla meant that splitting was performed eagerly and blocking clauses were targeted, which was not the case with SPASS in auto mode. We tested SPASS in auto mode only on the original files (translated from TPTP syntax to SPASS syntax).
The experiments were run on a cluster of 128 Dell PowerEdge M610 Blade Servers each with two Intel Xeon E5620 2.4 GHz processors and 48 GiB main memory each. The time limit was ten minutes (CPU time).

Results
Tables 3 and 4 summarize the results for satisfiable clausal problems in the TPTP library, measuring the number of problems solved within the time limit. The columns with the heading '#' give the number of problems in the TPTP categories and the different TPTP rating ranges. The subsequent columns give the number of problems solved within the time limit. The results are presented for the different BUMG methods that were used. For example, sh • rr • sdb refers to the method based on the transformation defined by the new range-restriction transformation, shifting and subterm domain blocking. To evaluate the effect of the different forms of blocking the results are grouped into groups of five: no blocking, subterm domain blocking (sdb), unrestricted domain blocking (udb), subterm predicate blocking (spb) and unrestricted predicate blocking (upb). In each group the first column provides the baseline for that group. The last column with the heading 'auto' gives the results for runs of SPASS Version 3.8d in auto mode on the original input files. The runtimes for the problems solved spanned the whole range, from less than one second to all of the time allowed. The best results in each group in each row are highlighted in bold font. The underlined values are the best results for all methods including SPASS in auto mode.

Table 3: Number of problems solved on satisfiable problems, by TPTP categories.

As expected the worst results in each group were obtained for the baseline transformations without blocking. This confirms the expectation that blocking is an essential technique for BUMG methods. Among the different blocking techniques the best results were obtained with unrestricted domain blocking in all four groups. Overall, the best result was obtained for the combination with rr and shifting, i.e., sh • rr • udb, solving 6.0% more problems than the second best method, crr • udb using the classical range-restriction transformation without shifting, and nearly 11% more problems than the transformations rr • udb and sh • crr • udb.
This means shifting had a significant positive effect in combination with the new range-restriction transformation, but less so in combination with classical range-restriction. The positive effect of shifting could also be seen for the number of problems solved without blocking for rr and sh•rr (34% improvement).
The good results for crr • udb show the value of classical range-restriction. In the LAT category, crr • udb solved 32 problems, whereas sh • rr • udb solved only 5 problems. This seems to indicate there was a trade-off between using the crr transformation and the rr transformation in combination with shifting, but also showed the virtues of unrestricted domain blocking as a universal technique for BUMG. SPASS in auto mode fared very well in the SWV category, where 79 problems were solved compared to 7-8 problems for the best BUMG methods. Overall SPASS in auto mode solved 9% fewer problems than the best BUMG method sh • rr • udb.
Looking at the top half of Table 4 (up to difficulty rating of 0.40), the BUMG method based on sh • rr • udb fared best, but for problems more difficult (up to a rating of 0.70) the performance deteriorated and the method crr • udb solved the highest number of problems. For problems with ratings higher than 0.70 SPASS in auto mode solved significantly more problems than the BUMG methods. One problem with rating 1.00 was solved by the crr • udb method (namely, GRP741-1 in 121.86 seconds). Problems in the TPTP library with rating 1.00 have not yet been solved by any other prover. Table 5 presents an evaluation of the different blocking techniques, listing the number of problems lost and the number of problems gained against the baseline methods in each group. The results confirm the significant positive effect of unrestricted domain blocking for satisfiable problems.
Table 4: Number of problems solved on satisfiable problems, by TPTP problem rating.

Analysis of the gain and loss of the method based on sh • rr • udb against the other methods gave these results: Against rr • udb 66 problems were gained and 20 problems lost; against sh • crr • udb the gain/loss was +90/-45 and against crr • udb it was +85/-59. This non-uniformity suggests each variation of range-restriction had the potential to solve some problems not solvable within the time limit by sh • rr with unrestricted blocking. The biggest variation was against SPASS in auto mode, where 169 problems were gained and 130 problems were lost. Table 6 displays how many problems were uniquely solved. The first row lists how many problems were uniquely solved over all methods including SPASS in auto mode. Although two of the BUMG methods with unrestricted domain blocking fared better than SPASS in auto mode, the latter solved a significant number of problems that none of the BUMG methods could solve (namely, 115 problems, or 27.5% of the problems solved by SPASS in auto mode, or 10.2% of all satisfiable problems). This reflected the orthogonality of the underlying methods. Analogously, the relatively low number of problems uniquely solved by the BUMG methods (21 problems, i.e., 1.9% of all satisfiable problems), which is also apparent from the number of problems solved uniquely among the BUMG methods in the second row (25 or 2.2% of all satisfiable problems), can be attributed to the similarity of the underlying methods. An analysis of the number of uniquely solved problems per group of BUMG methods in the third row of the table highlighted the importance of unrestricted domain blocking. While overall no problems were only solved with unary predicate blocking techniques, within the groups there were four problems solved only with unary predicate blocking. Table 7 gives an impression of the increase in the size of the input files caused by the transformations.
Although the file sizes were measured after all comments and white space were removed, variations in name lengths distort the values slightly (which can be seen in the values for shifting). The results therefore need to be interpreted cautiously. The average increase in file size does show a significant effect on the size of the problem for the new range-restriction transformations and also subterm blocking (both subterm domain blocking and subterm predicate blocking). The largest increase in size was observed for the problem SYO600-1 (13.7 fold increase), which contained 380 predicate symbols with arity up to 64, 2 constants and no non-constant function symbols. The main cause for this increase was the large number of clauses added in Step (4). This is a large number. In contrast, for the crr transformations the increase in size was negligible, and also, generally, it was significantly lower. Despite its positive virtues this shows a downside of the rr transformation. For problems containing a large number of function symbols with high arity, Step (5) similarly adds many clauses, even though the transformation overall is still effective.
Analysis of the problems solved without any form of blocking revealed a large number belonged to the Bernays-Schönfinkel class: 131/176 (74%) for rr, 132/236 (56%) for sh • rr, 133/140 (95%) for crr, and 134/142 (94%) for sh • crr. These results confirmed the expectation that more problems not solvable with the crr transformation can be solved with the rr transformation and the benefits of reducing the number of terms created.
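The quoted percentages can be recomputed directly from the four pairs of counts. The short Python check below uses only the numbers reported above, in the order in which the four baseline methods are listed; the variable names are illustrative.

```python
# (problems in the Bernays-Schönfinkel class, problems solved without blocking),
# one pair per baseline method, in the order given in the text.
counts = [(131, 176), (132, 236), (133, 140), (134, 142)]

# Share of BS problems among the solved problems, rounded to whole percent.
percentages = [round(100 * bs / solved) for bs, solved in counts]

# Problems solved without blocking that lie outside the BS class.
non_bs = [solved - bs for bs, solved in counts]

print(percentages)
print(non_bs)
```

The second list makes the difference between the transformations more visible than the percentages do: the number of solved BS problems is almost constant, while the number of solved problems outside the BS class varies considerably between the baselines.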
Although the main purpose of BUMG methods is disproving theorems and generating models for satisfiable problems, for completeness we report in Tables 8 and 9 the results for unsatisfiable clausal TPTP problems. The results were not as uniform as for satisfiable problems. However some general observations can be made. SPASS in auto mode fared best overall, and did so in all TPTP categories and each problem rating category. For unsatisfiable problems the drawback of BUMG methods is that clauses need to be exhaustively grounded and each branch in the derivation tree needs to be closed. The dominance of SPASS in auto mode is thus not surprising.
For the BUMG methods, a general deterioration in performance could be observed for shifting, when comparing the results for the groups with baselines sh • rr and sh • crr to the respective groups without shifting. This is plausible because shifting leads to fewer negative literals in clauses and more positive literals, thus reducing the constraining effect and leading to more splitting. For problems with higher rating, shifting did seem to have a positive effect; for instance, in the (0.40, 0.50] range, sh • rr solved 70 problems whereas rr solved 32 problems. Within the BUMG groups we expected best performance for the baseline transformations, because these do not involve blocking and performing many blocking steps leads to a significant overhead. However, only in the first group did the baseline transformation rr fare best. In combination with classical range-restriction crr, somewhat surprisingly, the best results were obtained with unrestricted domain blocking, the most expensive form of blocking, because it is applicable to any terms. Among the blocking techniques in each case the highest gain was obtained for unrestricted domain blocking (see Table 5). However, the greatest loss was also observed for this blocking technique. The smallest loss and lowest gain was obtained for upb blocking. The high loss for udb could be a reflection of the high increase in splitting steps preventing quicker detection of contradictions. Analogously the small loss for upb could be attributable to the smallest number of additional splitting steps among the blocking techniques. The high gain for udb blocking suggests the inference process panned out significantly differently, leading to solutions not found with the other techniques. This seems to be supported by the results in the third row of Table 6 according to which, with one exception, the largest number of uniquely solved problems in each group was obtained with udb blocking.
The exception was the first group, where rr led to the largest number of uniquely solved problems. Among all the BUMG methods, rr solved the largest number of problems not solved by any of the other methods. However, these results pale against the number of problems uniquely solved by SPASS in auto mode. Only one problem solved by a BUMG method was not solved by SPASS in auto mode.

Findings
Several findings can be drawn from the results. The results have confirmed our expectation that unrestricted domain blocking is a powerful technique, which helps discover finite models more often than the other blocking techniques. The results suggest the technique is indispensable for bottom-up model generation. Good results have been obtained in combination with both the new range-restricting transformation and the classical range-restricting transformation. Overall, the method based on new range-restriction, shifting and unrestricted domain blocking performed best on the sample. On satisfiable problems with higher difficulty rating, however, this method was gradually edged out by the method based on classical range-restriction and unrestricted domain blocking. This suggests there is a trade-off between the rr transformation, which is based on a non-trivial transformation but does restrict the creation of terms, and the simpler crr transformation, which has to rely on blocking to restrict the creation of terms.
The results for subterm domain blocking were good and often not far behind unrestricted domain blocking for satisfiable problems. In contrast, predicate blocking seems not to be effective on many problems. We attribute this to the nature of the problems in the TPTP library.
An investigation with SPASS-Yarralumla on translations of modal logic problems has revealed a different picture [Schmidt et al., 2014]. There, the best performance was obtained with subterm domain blocking for both satisfiable and unsatisfiable problems. Better results than for unrestricted domain blocking were also obtained with subterm predicate blocking and unrestricted predicate blocking. Better performances for subterm and predicate blocking are also expected on problems stemming from (cyclic) description logic knowledge bases. Experiments with blocking restricted by excluding a finite subset of the domain have shown better results than for unrestricted domain blocking for consistency testing on a large corpus of ontologies [Khodadadi et al., 2013]. The better performance for restricted forms of blocking on modal and description logic problems can be attributed to mainstream modal and description logics having the finite tree model property. This means every satisfiable formula holds in a model based on a finite tree, which is not a property of first-order formulae.
The results showed BUMG methods were good for disproving theorems and generating models for satisfiable problems. For unsatisfiable problems BUMG methods were however significantly less efficient than SPASS in auto mode. For theorem proving purposes a limitation of BUMG methods is that they require full grounding. It can be seen already from very small unsatisfiable examples that a complete BUMG derivation tree can be very large, whereas resolution proofs are significantly shorter.
Compared to resolution, an advantage of BUMG methods for satisfiable problems is the division of the search space into branches which are individually constructed and individually processed. As a consequence, if the right decisions are made at branching points models can be found more quickly. When the branching point decisions are less optimal the performance can deteriorate dramatically, particularly if the search is trapped in a branch with only infinite models. This could be another explanation for the lower success rate of the BUMG methods observed for more difficult satisfiable problems. For problems where only infinite models exist, clearly other methods are better.

Conclusions
We have presented and tested a number of enhancements for BUMG methods. An important aspect is that our enhancements exploit the strengths of readily available BUMG systems with only modest modifications. Our range restriction technique is a refinement of existing transformations to range-restricted clauses in that terms are added to the domain of interpretation on a 'by need' basis. Moreover, we have presented methods that allow us to extend BUMG methods with blocking techniques related to loop checking techniques with a long history in the more specialized setting of modal and description logics.
The experimental evaluation has shown blocking techniques are indispensable in BUMG methods for satisfiable problems. In particular, unrestricted domain blocking turned out to be the most powerful technique on problems from the TPTP library. Limiting the creation of terms during the inference process by using the new range restricting transformation paid off, leading to better results. It is particularly advisable together with the shifting transformation. The experimental results however also show that classical range restriction together with unrestricted blocking is a good complementary method. Because model generation methods are not just aimed at showing the existence of models but are built to construct and return models, when no model exists the entire search space must be traversed, which has led to inferior performance compared to saturation-based resolution.
Our bottom-up model generation approach is especially suitable for generating small models and it is possible to show the approach using unrestricted domain blocking allows us to compute finite models when they exist. The models produced by subterm blocking and predicate blocking are not as small as those produced by unrestricted domain blocking. In particular, the generated models do not need to be Herbrand models. It follows from how the transformations work that the generated models are quasi-Herbrand models, in the following sense. Whenever dom(s) and dom(t) hold in the (Herbrand) model constructed by the BUMG method, then (as in Herbrand interpretations) the terms s and t are mapped to themselves in the associated (possibly non-Herbrand) model. Reconsidering the example in the Introduction of the two unit clauses P(a) and Q(b), the associated model maps a and b to themselves, regardless of which transformations are applied, as long as they include a form of subterm blocking. In this way, more informative models are produced than those computed by, for example, MACE- and SEM-style finite model searchers (and also unrestricted domain blocking). From an applications perspective, this can be an advantage because larger models are more likely to be helpful to a user debugging mistakes in the formal specification of a program or protocol, or an ontology engineer trying to discover why an expected entailment does not follow from an ontology.
Research in automated theorem proving on developing decision procedures has concentrated on developing refinements of resolution, mainly ordering refinements, for deciding solvable fragments of first-order logic. Fragments decidable with ordered resolution are complementary to the fragments that can be decided by refinements using the techniques presented in this paper. We have thus extended the set of techniques available for resolution methods to turn them into more effective and efficient (terminating) automated reasoning methods. In particular, we have shown that all procedures based on hyperresolution, or BUMG methods, can decide the Bernays-Schönfinkel class and the class of BS clauses with equality.
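The termination argument behind the Bernays-Schönfinkel result can be illustrated with a small sketch. It is not the SPASS-based implementation used in the experiments; it is a minimal bottom-up procedure with splitting, written under the assumptions that the input clauses are function-free, already range-restricted (every variable in a head also occurs in the body) and free of equality. The clause representation and the names `find_model`, `body_matches`, `match` and `instantiate` are hypothetical and chosen only for this illustration. Because there are no function symbols, only finitely many ground atoms exist, every recursive call adds a new one, and so the search must stop.

```python
# Illustrative sketch only (hypothetical names, not the paper's implementation).
# A clause is a pair (body, head): body atoms imply the disjunction of head atoms.
# An atom is a tuple (predicate, arg1, ...); variables start with an upper-case letter.

def is_var(term):
    return term[:1].isupper()

def match(atom, fact, subst):
    """Extend subst so that atom becomes fact; return None if impossible."""
    if atom[0] != fact[0] or len(atom) != len(fact):
        return None
    subst = dict(subst)
    for a, f in zip(atom[1:], fact[1:]):
        if is_var(a):
            if subst.setdefault(a, f) != f:
                return None
        elif a != f:
            return None
    return subst

def body_matches(body, facts, subst=None):
    """Yield every substitution mapping all body atoms to known facts."""
    subst = {} if subst is None else subst
    if not body:
        yield subst
        return
    for fact in facts:
        extended = match(body[0], fact, subst)
        if extended is not None:
            yield from body_matches(body[1:], facts, extended)

def instantiate(atom, subst):
    return (atom[0],) + tuple(subst.get(t, t) for t in atom[1:])

def find_model(clauses, facts=frozenset()):
    """Return a set of ground atoms satisfying all clauses, or None if none exists."""
    for body, head in clauses:
        for subst in body_matches(body, sorted(facts)):
            heads = [instantiate(h, subst) for h in head]
            if any(h in facts for h in heads):
                continue  # this clause instance is already satisfied
            # Violated instance: split on the head atoms.
            # An empty head means the branch is closed.
            for h in heads:
                model = find_model(clauses, facts | {h})
                if model is not None:
                    return model
            return None
    return set(facts)  # no violated instance is left

clauses = [
    ([], [("p", "a")]),                         # P(a)
    ([], [("q", "b")]),                         # Q(b)
    ([("p", "X")], [("r", "X"), ("s", "X")]),   # P(x) -> R(x) v S(x)
    ([("r", "X"), ("p", "X")], []),             # R(x) & P(x) -> false
]
model = find_model(clauses)
print(sorted(model))  # [('p', 'a'), ('q', 'b'), ('s', 'a')]
```

In this run the branch that chooses R(a) is closed by the last clause, so the procedure backtracks and returns the branch containing S(a). Handling equality, and hence the blocking techniques of this paper, would need additional machinery that the sketch deliberately leaves out.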
Studying how well the ideas and techniques discussed in this paper can be exploited and behave in dedicated BUMG provers, tableau-based provers and other provers (including resolution-based provers) is very important but is beyond the scope of the present paper. Initial results with another prover, Darwin, are very encouraging. An in-depth comparison and analysis of BUMG approaches with our techniques and MACE-style or SEM-style model generation would also be of interest. Another source for future work is to combine the presented transformations with other BUMG techniques, such as magic sets transformations [Hasegawa et al., 1997, Stickel, 1994], a typed version of range-restriction [Baumgartner et al., 1997], and minimal model computation [Bry and Yahya, 2000, Bry and Torge, 1998, Papacchini and Schmidt, 2011]. Because our transformations were designed to be generic, we believe that they carry over to formalisms with default negation, which could provide a possible basis for enhancements to answer-set programming systems.