Abstract
Uniform interpolants have been largely studied in non-classical propositional logics since the nineties; a successive research line within the automated reasoning community investigated uniform quantifier-free interpolants (sometimes referred to as “covers”) in first-order theories. This further research line is motivated by the fact that uniform interpolants offer an effective solution to tackle quantifier elimination and symbol elimination problems, which are central in model checking infinite-state systems. This was first pointed out in ESOP 2008 by Gulwani and Musuvathi, and then by the authors of the present contribution in the context of recent applications to the verification of data-aware processes. In this paper, we show how covers are strictly related to model completions, a well-known topic in model theory. We also investigate the computation of covers within the Superposition Calculus, by adopting a constrained version of the calculus and by defining appropriate settings and reduction strategies. In addition, we show that computing covers is computationally tractable for the fragment of the language used when tackling the verification of data-aware processes. This observation is confirmed by analyzing the preliminary results obtained using the mcmt tool to verify relevant examples of data-aware processes. These examples can be found in the latest version of the tool distribution.
1 Introduction
Uniform interpolants were originally studied in the context of non-classical logics, starting from the pioneering work by Pitts [55]. We briefly recall what uniform interpolants are. We fix a logic or a theory T and a suitable fragment L (propositional, first-order quantifier-free, etc.) of its language. Given an L-formula \(\phi (\underline{x}, \underline{y})\) (where \(\underline{x},\underline{y}\) are the variables occurring free in \(\phi \)), a uniform interpolant of \(\phi \) (w.r.t. \(\underline{y}\)) is an L-formula \(\phi '(\underline{x})\) in which only the variables \(\underline{x}\) occur free, and that satisfies the following two properties: (i) \(\phi (\underline{x}, \underline{y})\vdash _T \phi '(\underline{x})\); (ii) for any further L-formula \(\psi (\underline{x}, \underline{z})\) such that \(\phi (\underline{x}, \underline{y}) \vdash _T \psi (\underline{x}, \underline{z})\), we have \(\phi '(\underline{x}) \vdash _T \psi (\underline{x}, \underline{z})\). Whenever uniform interpolants exist, one can compute an interpolant for an entailment like \(\phi (\underline{x}, \underline{y}) \vdash _T \psi (\underline{x}, \underline{z})\) in such a way that this interpolant is independent of \(\psi \).
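For intuition, in classical propositional logic every formula has a uniform interpolant, computable by Boolean quantifier elimination: substitute \(\top \) and \(\bot \) for the variables to be eliminated and take the disjunction. A small worked instance (our own illustration, not taken from the cited literature):

```latex
% Uniform interpolant of \varphi w.r.t. y, computed as
% \varphi[\top/y] \vee \varphi[\bot/y] (Boolean quantifier elimination):
\varphi(x_1,x_2,y) \;=\; (x_1 \rightarrow y)\wedge(y \rightarrow x_2)
\\[2pt]
\varphi[\top/y] \;\equiv\; x_2 \qquad \varphi[\bot/y] \;\equiv\; \lnot x_1
\\[2pt]
\varphi'(x_1,x_2) \;=\; x_2 \vee \lnot x_1 \;\equiv\; x_1 \rightarrow x_2
```

Indeed \(\varphi \vdash x_1\rightarrow x_2\), and every propositional consequence of \(\varphi \) not containing y already follows from \(x_1\rightarrow x_2\).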
The existence of uniform interpolants is an exceptional phenomenon, which is however not so infrequent; it has been extensively studied in non-classical logics starting from the nineties, as witnessed by a large literature, including for instance [7, 22, 33,34,35, 43, 58, 62, 63]. The main results from the above papers are that uniform interpolants exist for intuitionistic logic and for some modal systems (like the Gödel-Löb system and the S4.Grz system); they do not exist for instance in S4 and K4, whereas for the basic modal system K they exist for the local consequence relation but not for the global consequence relation.
In the last decade, the automated reasoning community also developed an increasing interest in uniform interpolants, with particular focus on quantifier-free fragments of first-order theories. This is witnessed by various talks and drafts by Kapur presented in many conferences and workshops (FloC 2010, ISCAS 2013-14, SCS 2017 [41]), as well as by the paper presented at ESOP 2008 by Gulwani and Musuvathi [37]. In this last paper, uniform interpolants were renamed covers, a terminology we shall adopt in this paper too. In these contributions, examples of cover computations were supplied and some algorithms were sketched. The first formal proof of the existence of covers in \(\mathcal {EUF}\) was however published only in [14] by the present authors; this proof was equipped with powerful semantic tools (see the Cover-by-Extensions Lemma 3.1 below) obtained thanks to interesting connections with model completeness [56], and came with an algorithm for computing covers based on a constrained variant of the Superposition Calculus [54]. Both the model-theoretic tools and the algorithm are detailed in the present paper. Two simple additional algorithms, which exploit DAG representations of terms, are studied in [26, 27].
The usefulness of covers in model checking was already stressed in [37] and further motivated by our recent line of research on the verification of data-aware processes (also called 'database-driven applications' in this paper) [12, 13, 15, 17]. Notably, this is also operationally mirrored in the MCMT model checker [32] starting from version 2.8 (database-driven module). The need for incorporating a cover-computation algorithm within MCMT is due to the following reason. Declarative approaches to infinite-state model checking [57] need to manipulate logical formulae in order to represent sets of reachable states. To prevent divergence, various abstraction strategies have been adopted, ranging from interpolation-based [47] to sophisticated search via counterexample elimination [38]. Precise computation of the set of reachable states requires some form of quantifier elimination and is hence subject to two problems, namely that quantifier elimination might not be available at all and that, when available, it is computationally very expensive. To cope with the first problem, Gulwani and Musuvathi [37] introduced the notion of cover and showed that covers can be used as an alternative to quantifier elimination and yield a precise computation of reachable states. Concerning the second problem, again in [37] it was observed (as a side remark) that computing the cover of a conjunction of literals becomes tractable when only free unary function symbols occur in the signature. We show here (see Sect. 6 below) that the same observation applies when free relational symbols also occur.
In [11, 13] we propose a new formalism for representing read-only database schemas towards the verification of integrated models of processes and data [10] (called data-aware processes henceforth), in particular so-called artifact systems [8, 20, 44, 61]; this formalism (briefly recalled in Sect. 4.1 below) precisely uses signatures comprising unary function symbols and free n-ary relations. In [11, 13, 17] we apply model completeness techniques for verifying transition systems based on read-only databases, in a framework where such systems employ both individual and higher-order variables.
In this paper we show (see Sect. 3 below) that covers (alias uniform interpolants) are strictly related to model completions, thus creating a bridge that links different research areas. In particular, we prove that computing covers for a theory is equivalent to eliminating quantifiers in its model completion. This connection reproduces, in a first-order setting, an analogous well-known connection for propositional logics: the connection between propositional uniform interpolants and model completions of equational theories axiomatizing the varieties corresponding to propositional logics, which was first stated in [36] and further developed in [33, 43, 62]. Interestingly, model completeness has other well-known applications in computer science. It has been applied: (i) to reveal interesting connections between temporal logic and monadic second-order logic [29, 30]; (ii) in automated reasoning, to design complete algorithms for constraint satisfiability in combined theories over non-disjoint signatures [1, 23, 31, 49,50,51]; (iii) again in automated reasoning, in relationship with interpolation and symbol elimination [59, 60]; (iv) in modal logic and in software verification theories [24, 25], to obtain combined interpolation results.
This paper is organized as follows. After some preliminaries in Sect. 2, we first state the formal connection between uniform quantifier-free interpolation and model completions in Sect. 3. Then, in Sect. 4 we report our applications (mostly taken from [17]) concerning the verification of data-aware processes. We begin the second part of the paper by proving (Sect. 5 below) that covers for \(\mathcal {EUF}\) can be computed through a constrained version of the Superposition Calculus [54] equipped with appropriate settings and reduction strategies; the related completeness proof requires a careful analysis of the constrained literals generated during the saturation process. Complexity bounds for the fragment used in data-aware process verification are investigated in Sect. 6; an extension of our constrained Superposition Calculus that handles a schema of additional constraints (useful for our applications) is provided in Sect. 7; in Sect. 8 we give some details about our first implementation in our tool mcmt. This paper is the extended version of [14]: apart from containing more basic preliminary material, a thorough account of model-checking applications, full proofs and detailed examples, in Sects. 6 and 7 this paper covers additional new results on complexity analysis and extensions.
2 Preliminaries
We adopt the usual first-order syntactic notions of signature, term, atom, (ground) formula, and so on; our signatures are multi-sorted and include equality for every sort. This implies that variables are sorted as well. For simplicity, most basic definitions in this section will be supplied for single-sorted languages only. However, the adaptation to multi-sorted languages is straightforward: for example, a multi-sorted signature \(\varSigma \) must contain not only constant, function and relation symbols, but also sorts. We compactly represent a tuple \(\langle x_1,\ldots ,x_n\rangle \) of variables as \(\underline{x}\). The notation \(t(\underline{x}), \phi (\underline{x})\) means that the term t and the formula \(\phi \) have free variables included in the tuple \(\underline{x}\). Our tuples are assumed to be formed by distinct variables; thus we underline that, writing e.g. \(\phi (\underline{x}, \underline{y})\), we mean that the tuples \(\underline{x}, \underline{y}\) are made of distinct variables that are also disjoint from each other.
We assume that the arity of a function symbol can be deduced from the context. Whenever we build terms and formulae, we always assume that they are well-typed, in the sense that the sorts of variables, constants, and function sources/targets match. A formula is said to be universal (resp., existential) if it has the form \(\forall \underline{x}(\phi (\underline{x}))\) (resp., \(\exists \underline{x}(\phi (\underline{x}))\)), where \(\phi \) is a quantifier-free formula. Formulae with no free variables are called sentences.
From the semantic side, we use the standard notion of a \(\varSigma \)-structure \(\mathcal M\) and of truth of a formula in a \(\varSigma \)-structure under a free-variables assignment. The support of a structure \(\mathcal M\) is the disjoint union of the interpretations of the \(\varSigma \)-sorts in \(\mathcal M\) and is indicated with \(\vert \mathcal M\vert \).
A \(\varSigma \)-theory T is a set of \(\varSigma \)-sentences; a model of T is a \(\varSigma \)-structure \(\mathcal M\) in which all sentences of T are true. We use the standard notation \(T\models \phi \) to say that \(\phi \) is true in all the models of T for every assignment to the variables occurring free in \(\phi \). We say that \(\phi \) is T-satisfiable iff there is a model \(\mathcal M\) of T and an assignment to the variables occurring free in \(\phi \) making \(\phi \) true in \(\mathcal M\).
A \(\varSigma \)-formula \(\phi \) is a \(\varSigma \)-constraint (or just a constraint) iff it is a conjunction of literals, i.e. of atomic formulae and their negations.
The constraint satisfiability problem for T is the following: we are given a constraint (equivalently, a quantifier-free formula) \(\phi (\underline{x})\) and we are asked whether there exist a model \(\mathcal M\) of T and an assignment \(\mathcal I\) to the free variables \(\underline{x}\) such that \(\mathcal M, \mathcal I\models \phi \).
A theory T has quantifier elimination iff for every formula \(\phi (\underline{x})\) in the signature of T there is a quantifier-free formula \(\phi '(\underline{x})\) such that \(T\models \phi (\underline{x})\leftrightarrow \phi '(\underline{x})\). It is well-known (and easily seen) that quantifier elimination holds whenever we can eliminate quantifiers from primitive formulae, i.e. from formulae of the kind \(\exists \underline{y}\,\phi (\underline{x}, \underline{y})\), where \(\phi \) is a constraint. Since we are interested in effective computability, we assume that when we talk about quantifier elimination, an effective procedure for eliminating quantifiers is given.
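A standard illustration (our own, not taken from this paper): in the theory of dense linear orders without endpoints, quantifiers can be eliminated from primitive formulae, because density always supplies the required witness:

```latex
% Dense linear orders without endpoints: density provides the witness y,
% so the existential quantifier can be dropped.
\exists y\,(x < y \,\wedge\, y < z) \;\;\leftrightarrow\;\; x < z
```

The left-to-right direction holds by transitivity of \(<\); the right-to-left direction uses density to pick y strictly between x and z.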
We recall also some basic definitions and notions from model theory.
Let \(\varSigma \) be a first-order signature. The signature obtained from \(\varSigma \) by adding to it a set \(\underline{a}\) of new constants (i.e., 0-ary function symbols) is denoted by \(\varSigma ^{\underline{a}}\). Analogously, given a \(\varSigma \)-structure \(\mathcal M\), the signature \(\varSigma \) can be expanded to a new signature \(\varSigma ^{\mathcal M}:=\varSigma \cup \{\bar{a}\mid a\in \vert \mathcal M\vert \}\) by adding a new constant \(\bar{a}\) (the name of a) for each element a of \(\mathcal M\), with the convention that two distinct elements are denoted by different name constants. \(\mathcal M\) can be expanded to a \(\varSigma ^{\mathcal M}\)-structure by interpreting each additional constant as the corresponding element. From now on, we identify \(\mathcal M\) with this expanded structure and we do not distinguish between an element of \(\mathcal M\) and its name. Thus we employ notations like \(\mathcal M\models \phi (\underline{a})\) to mean that the sentence \(\phi (\underline{a})\) (obtained by replacing the free variables \(\underline{x}\) of \(\phi (\underline{x})\) by the names of some tuple \(\underline{a}\) from \(\mathcal M\)) is true in \(\mathcal M\), once \(\mathcal M\) is canonically expanded to a \(\varSigma ^{\mathcal M}\)-structure as explained above. Notice that this is the same as saying that \(\phi (\underline{x})\) is true in \(\mathcal M\) under the assignment mapping the \(\underline{x}\) to the \(\underline{a}\).
A \(\varSigma \)-embedding [18] (or, simply, an embedding) between two \(\varSigma \)-structures \(\mathcal M\) and \(\mathcal N\) is a map \(\mu : \vert \mathcal M\vert \longrightarrow \vert \mathcal N\vert \) between the support sets \(\vert \mathcal M\vert \) of \(\mathcal M\) and \(\vert \mathcal N\vert \) of \(\mathcal N\) satisfying the condition \(\mathcal M\models \varphi ~\Rightarrow ~\mathcal N\models \varphi \) for all \(\varSigma ^{\vert \mathcal M\vert }\)-literals \(\varphi \) (here \(\mathcal M\) is regarded as a \(\varSigma ^{\vert \mathcal M\vert }\)-structure by interpreting each additional constant \(a\in \vert \mathcal M\vert \) as itself, and \(\mathcal N\) is regarded as a \(\varSigma ^{\vert \mathcal M\vert }\)-structure by interpreting each additional constant \(a\in \vert \mathcal M\vert \) as \(\mu (a)\)). If \(\mu : \mathcal M\longrightarrow \mathcal N\) is an embedding which is just the identity inclusion \(\vert \mathcal M\vert \subseteq \vert \mathcal N\vert \), we say that \(\mathcal M\) is a substructure of \(\mathcal N\) or that \(\mathcal N\) is an extension of \(\mathcal M\). We recall that a substructure preserves and reflects validity of ground formulae, in the following sense: given a \(\varSigma \)-substructure \(\mathcal M_1\) of a \(\varSigma \)-structure \(\mathcal M_2\), a ground \(\varSigma ^{\vert \mathcal M_1\vert }\)-sentence \(\theta \) is true in \(\mathcal M_1\) iff \(\theta \) is true in \(\mathcal M_2\).
Let \(\mathcal M\) be a \(\varSigma \)-structure. The diagram of \(\mathcal M\), written \(\varDelta _{\varSigma }(\mathcal M)\) (or just \(\varDelta (\mathcal M)\)), is the set of ground \(\varSigma ^{\mathcal M}\)-literals that are true in \(\mathcal M\). An easy but important result, called the Robinson Diagram Lemma [18], says that, given any \(\varSigma \)-structure \(\mathcal N\), the embeddings \(\mu : \mathcal M\longrightarrow \mathcal N\) are in bijective correspondence with the expansions of \(\mathcal N\) to \(\varSigma ^{\vert \mathcal M\vert }\)-structures which are models of \(\varDelta _{\varSigma }(\mathcal M)\). The expansions and the embeddings are related in the obvious way: \({\bar{a}}\) is interpreted as \(\mu (a)\). The typical use of the Robinson Diagram Lemma is the following: suppose we want to show that some structure \(\mathcal M\) can be embedded into a structure \(\mathcal N\) in such a way that some set of sentences \(\varDelta \) is true. By the Lemma, this turns out to be equivalent to the fact that the set of sentences \(\varDelta (\mathcal M)\cup \varDelta \) is consistent: thus, the Diagram Lemma can be used to transform an embeddability problem into a consistency problem (the latter being a problem of a logical nature, to be solved for instance by appealing to the compactness theorem for first-order logic).
Amalgamation is a classical algebraic concept. We give the formal definition:
Definition 2.1
(Amalgamation) A theory T has the amalgamation property if for every pair of embeddings \(\mu _1:\mathcal M_0\longrightarrow \mathcal M_1\), \(\mu _2:\mathcal M_0\longrightarrow \mathcal M_2\) among models of T, there exists a model \(\mathcal M\) of T endowed with embeddings \(\nu _1:\mathcal M_1 \longrightarrow \mathcal M\) and \(\nu _2:\mathcal M_2 \longrightarrow \mathcal M\) such that \(\nu _1\circ \mu _1=\nu _2\circ \mu _2\). \(\lhd \)
3 Covers, Uniform Interpolation and Model Completions
We report the notion of cover taken from [37]. Fix a theory T and an existential formula \(\exists \underline{e}\, \phi (\underline{e}, \underline{y})\); call a residue of \(\exists \underline{e}\, \phi (\underline{e}, \underline{y})\) any quantifier-free formula belonging to the set of quantifier-free formulae \(Res(\exists \underline{e}\, \phi ) = \{\theta (\underline{y}, \underline{z}) \mid T\models \phi (\underline{e}, \underline{y})\rightarrow \theta (\underline{y}, \underline{z})\}\).
A quantifier-free formula \(\psi (\underline{y})\) is said to be a T-cover (or, simply, a cover) of \(\exists \underline{e}\, \phi (\underline{e},\underline{y})\) iff \(\psi (\underline{y})\in Res(\exists \underline{e}\, \phi )\) and \(\psi (\underline{y})\) implies (modulo T) all the other formulae in \(Res(\exists \underline{e}\, \phi )\). Notice that the cover is unique, modulo T-equivalence. Alternatively, \(\psi (\underline{y})\) is also said to be a T-uniform (quantifier-free) interpolant of \(\phi (\underline{e},\underline{y})\). The following Lemma (to be widely used throughout the paper) supplies a semantic counterpart to the notion of a cover:
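A classical example of a cover in \(\mathcal {EUF}\), along the lines of the examples in [37] (here f is a free binary function symbol and e is the variable to be eliminated):

```latex
% The only consequence about y_1, y_2, y'_1, y'_2 that survives the
% elimination of e is the congruence constraint below, which is the cover.
\exists e\,\bigl(f(e,y_1) = y'_1 \;\wedge\; f(e,y_2) = y'_2\bigr)
\qquad\text{has cover}\qquad
y_1 = y_2 \;\rightarrow\; y'_1 = y'_2
```

The implication is a residue by congruence of f; conversely, whenever it holds one can always extend a model with a fresh e realizing the two equations, so nothing stronger is entailed.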
Lemma 3.1
(Cover-by-Extensions) A formula \(\psi (\underline{y})\) is a T-cover of \(\exists \underline{e}\, \phi (\underline{e}, \underline{y})\) iff it satisfies the following two conditions: (i) \(T\models \phi (\underline{e}, \underline{y})\rightarrow \psi (\underline{y})\); (ii) for every model \(\mathcal M\) of T and for every tuple of elements \(\underline{a}\) from the support of \(\mathcal M\) such that \(\mathcal M\models \psi (\underline{a})\), it is possible to find another model \(\mathcal N\) of T such that \(\mathcal M\) embeds into \(\mathcal N\) and \(\mathcal N\models \exists \underline{e}\, \phi (\underline{e}, \underline{a})\). \(\lhd \)
Proof
Suppose that \(\psi (\underline{y})\) satisfies conditions (i) and (ii) above. Condition (i) says that \(\psi (\underline{y})\in Res(\exists \underline{e}\, \phi )\), so \(\psi \) is a residue. In order to show that \(\psi \) is also a cover, we have to prove that \(T\models \psi (\underline{y})\rightarrow \theta (\underline{y},\underline{z})\), for every \(\theta (\underline{y},\underline{z})\) that is a residue of \(\exists \underline{e}\, \phi (\underline{e}, \underline{y})\). Given a model \(\mathcal M\) of T, take a pair of tuples \(\underline{a}, \underline{b}\) of elements from \(\mathcal M\) and suppose that \(\mathcal M\models \psi (\underline{a})\). By condition (ii), there is a model \(\mathcal N\) of T such that \(\mathcal M\) embeds into \(\mathcal N\) and \(\mathcal N\models \exists \underline{e}\, \phi (\underline{e}, \underline{a})\). Using the definition of \(Res(\exists \underline{e}\, \phi )\), we have \(\mathcal N\models \theta (\underline{a},\underline{b})\), since \(\theta (\underline{y},\underline{z})\in Res(\exists \underline{e}\, \phi )\). Since \(\mathcal M\) is a substructure of \(\mathcal N\) and \(\theta \) is quantifier-free, \(\mathcal M\models \theta (\underline{a},\underline{b})\) as well, as required.
Suppose now that \(\psi (\underline{y})\) is a cover. The definition of residue implies condition (i). To show condition (ii) we have to prove that, given a model \(\mathcal M\) of T, for every tuple \(\underline{a}\) of elements from \(\mathcal M\), if \(\mathcal M\models \psi (\underline{a})\), then there exists a model \(\mathcal N\) of T such that \(\mathcal M\) embeds into \(\mathcal N\) and \(\mathcal N\models \exists \underline{x}\, \phi (\underline{x}, \underline{a})\). Using the Robinson Diagram Lemma, we can reformulate the latter embeddability statement into a consistency statement: what we need to prove is that \(\varDelta (\mathcal M)\cup \{ \exists \underline{x}\, \phi (\underline{x}, \underline{a}) \}\) is a T-consistent \(\varSigma ^{\mathcal M}\)-set of sentences (\(\varSigma \) being the signature of T). By reduction to absurdity, suppose that this is not the case: by compactness, there is a finite number of literals \(\ell _1(\underline{a},\underline{b}),\dots ,\ell _m(\underline{a},\underline{b})\) (for some tuple \(\underline{b}\) of elements from \(\mathcal M\)) such that \(\mathcal M\models \ell _i(\underline{a},\underline{b})\) (for all \(i=1,\dots ,m\)) and
\(T\models \phi (\underline{x}, \underline{a})\rightarrow \lnot \ell _1(\underline{a},\underline{b})\vee \dots \vee \lnot \ell _m(\underline{a},\underline{b}). \qquad (*)\)
Now, the constants \(\underline{a},\underline{b}\) do not occur in the axioms of T and do not belong to \(\varSigma \); hence we can replace them by variables \(\underline{y}, \underline{z}\) in the T-proof witnessing \((*)\): indeed, since they do not occur in the axioms of T, they are generic from the point of view of T. As a consequence, we get
\(T\models \phi (\underline{x}, \underline{y})\rightarrow \lnot \ell _1(\underline{y},\underline{z})\vee \dots \vee \lnot \ell _m(\underline{y},\underline{z}).\)
By definition of residue, clearly \((\lnot \ell _1(\underline{y},\underline{z})\vee \dots \vee \lnot \ell _m(\underline{y},\underline{z})) \in Res(\exists \underline{x}\, \phi )\); then, since \(\psi (\underline{y})\) is a cover, \(T\models \psi (\underline{y})\rightarrow \lnot \ell _1(\underline{y},\underline{z})\vee \dots \vee \lnot \ell _m(\underline{y},\underline{z})\). Replacing back the variables \(\underline{y}, \underline{z}\) by the constants \(\underline{a}, \underline{b}\) and recalling that \(\mathcal M\models \psi (\underline{a})\), this implies that \(\mathcal M\models \lnot \ell _j(\underline{a},\underline{b})\) for some \(j=1,\dots ,m\), which is a contradiction. Thus, \(\psi (\underline{y})\) satisfies condition (ii) too. \(\square \)
We say that a theory T has uniform quantifier-free interpolation iff every existential formula \(\exists \underline{e}\, \phi (\underline{e},\underline{y})\) (equivalently, every primitive formula \(\exists \underline{e}\, \phi (\underline{e},\underline{y})\)) has a T-cover. It is clear that if T has uniform quantifier-free interpolation, then it has ordinary quantifier-free interpolation [9], in the sense that if \(T\models \phi (\underline{e},\underline{y})\rightarrow \phi '(\underline{y},\underline{z})\) (for quantifier-free formulae \(\phi , \phi '\)), then there is a quantifier-free formula \(\theta (\underline{y})\) such that \(T\models \phi (\underline{e},\underline{y})\rightarrow \theta (\underline{y})\) and \(T\models \theta (\underline{y})\rightarrow \phi '(\underline{y},\underline{z})\). In fact, if T has uniform quantifier-free interpolation, then the interpolant \(\theta \) is independent of \(\phi '\): indeed, the same \(\theta (\underline{y})\) can be used as interpolant for all entailments \(T\models \phi (\underline{e},\underline{y})\rightarrow \phi '(\underline{y},\underline{z})\), as \(\phi '\) varies.
We say that a universal theory T has a model completion iff there is a stronger theory \(T^*\supseteq T\) (still within the same signature \(\varSigma \) of T) such that (i) every \(\varSigma \)-constraint that is satisfiable in a model of T is satisfiable in a model of \(T^*\); (ii) \(T^*\) eliminates quantifiers. Other equivalent definitions are possible [18]: for instance, (i) is equivalent to the fact that T and \(T^*\) prove the same quantifier-free formulae, or again to the fact that every model of T can be embedded into a model of \(T^*\). We recall that the model completion, if it exists, is unique and that its existence implies the amalgamation property for T [18]. The relationship between uniform interpolation in a propositional logic and the model completion
of the equational theory of the variety algebraizing it was extensively studied in [33]. In the context of first-order theories, we prove an even more direct connection:
Theorem 3.2
Suppose that T is a universal theory. Then T has a model completion \(T^*\) iff T has uniform quantifier-free interpolation. If this happens, \(T^*\) is axiomatized by adding to T the infinitely many sentences
\(\forall \underline{y}\,\bigl (\psi (\underline{y}) \rightarrow \exists \underline{e}\, \phi (\underline{e}, \underline{y})\bigr ) \qquad (1)\)
where \(\exists \underline{e}\, \phi (\underline{e}, \underline{y})\) is a primitive formula and \(\psi \) is a cover of it. \(\lhd \)
Proof
Suppose first that there is a model completion \(T^*\) of T and let \(\exists \underline{e}\,\phi (\underline{e},\underline{y})\) be a primitive formula. Since \(T^*\) eliminates quantifiers, we have \(T^*\models \exists \underline{e}\,\phi (\underline{e},\underline{y})\leftrightarrow \psi (\underline{y})\) for some quantifier-free formula \(\psi (\underline{y})\). Since T and \(T^*\) prove the same quantifier-free formulae, from the left-to-right implication we get that \(\psi (\underline{y})\in Res(\exists \underline{e}\,\phi )\). If \(\theta (\underline{y},\underline{z})\in Res(\exists \underline{e}\,\phi )\), then we have \(T\models \phi (\underline{e},\underline{y})\rightarrow \theta (\underline{y},\underline{z})\); the same entailment holds in \(T^*\) too, where we have \(T^*\models \psi (\underline{y})\rightarrow \theta (\underline{y},\underline{z})\). Since \(\psi (\underline{y})\rightarrow \theta (\underline{y},\underline{z})\) is quantifier-free, we also have \(T\models \psi (\underline{y})\rightarrow \theta (\underline{y},\underline{z})\), showing that \(\psi \) is a cover of \(\exists \underline{e}\,\phi (\underline{e},\underline{y})\). Thus T has uniform interpolation, because we found a cover for every primitive formula.
Suppose vice versa that T has uniform interpolation. Let \(T^*\) be the theory obtained by adding to T all the formulae (1) above.
From (i) of Lemma 3.1 and (1) above, we clearly get that \(T^{\star }\) admits quantifier elimination: in fact, in order to prove that a theory enjoys quantifier elimination, it is sufficient to eliminate quantifiers from primitive formulae (then the quantifier elimination for all formulae can be easily shown by an induction over their complexity). This is exactly what is guaranteed by (i) of Lemma 3.1 and (1).
Let \(\mathcal M\) be a model of T. By using a chain argument [17] (see [18], Lemma 3.5.7 for an almost identical construction), we show that there exists a model \(\mathcal M^{\prime }\) of \(T^{\star }\) such that \(\mathcal M\) embeds into \(\mathcal M^{\prime }\).
Consider the set of all pairs \((\underline{a}, \exists \underline{e}\,\phi (\underline{e}, \underline{a}))\), where \(\underline{a}\) is a tuple from \(\mathcal M\), \(\exists \underline{e}\,\phi (\underline{e},\underline{y})\) is a primitive formula and \(\mathcal M\models \psi (\underline{a})\) (here \(\psi \) is a cover of \(\exists \underline{e}\,\phi \)). By Zermelo's Theorem, the set of such pairs \((\underline{a}, \exists \underline{e}\,\phi (\underline{e}, \underline{a}))\) can be well-ordered: let \(\{(\underline{a}_i,\exists \underline{e}_i\,\phi _i(\underline{e}_i, \underline{a}_i))\}_{i\in I}\) be such a well-ordered set of pairs, where I is some ordinal.^{Footnote 1} By transfinite induction on this well-order, we define \(\mathcal M_{0}:=\mathcal M\) and, for each \(i\in I\), \(\mathcal M_{i}\) as an extension of \(\bigcup _{j<i}\mathcal M_j\) such that \(\mathcal M_{i}\models \exists \underline{e}_i\,\phi _i(\underline{e}_i, \underline{a}_i)\), which exists by (ii) of Lemma 3.1 since \(\bigcup _{j<i}\mathcal M_j\models \psi _i(\underline{a}_i)\), where \(\psi _i\) is a cover of \(\exists \underline{e}_i\,\phi _i\) (remember that validity of ground formulae is preserved when passing to substructures and superstructures, and \(\mathcal M\models \psi _i(\underline{a}_i)\)).
Now we take the chain union \(\mathcal M^{1}:=\bigcup _{i\in I} \mathcal M_{i}\): since T is universal, \(\mathcal M^{1}\) is again a model of T. Thanks to this construction, we added, for every pair \((\underline{a}_i,\exists \underline{e}_i\,\phi _i(\underline{e}_i, \underline{a}_i))\) (with \(\underline{a}_i\) from \(\mathcal M\) and \(\mathcal M\models \psi _i(\underline{a}_i)\)), a corresponding tuple \(\underline{b}_i\) such that \(\mathcal M^{1}\models \phi _i(\underline{b}_i, \underline{a}_i)\); however, this only guarantees that such a tuple \(\underline{b}_i\) exists for every pair \((\underline{a}_i,\exists \underline{e}_i\,\phi _i(\underline{e}_i, \underline{a}_i))\) such that the tuple \(\underline{a}_i\) is from \(\mathcal M\), whereas nothing is said for the pairs where the tuple \(\underline{a}\) is in \(\mathcal M^{1}\setminus \mathcal M\). Hence, we iteratively repeat the chain construction above for these new \(\underline{a}\): by an analogous chain argument, it is possible to construct a model \(\mathcal M^{2}\) as done above, starting from \(\mathcal M^{1}\) instead of \(\mathcal M\). Clearly, we get \(\mathcal M_{0}:=\mathcal M\subseteq \mathcal M^{1} \subseteq \mathcal M^2\) by construction.
At this point, we iterate the same argument countably many times, so as to define a new chain of models of T:
\(\mathcal M:=\mathcal M^{0}\subseteq \mathcal M^{1}\subseteq \mathcal M^{2}\subseteq \cdots \subseteq \mathcal M^{n}\subseteq \cdots \)
Defining \(\mathcal M^{\prime }:=\bigcup _n \mathcal M^{n}\), we trivially get that \(\mathcal M^{\prime }\) is a model of T such that \(\mathcal M\subseteq \mathcal M^{\prime }\) and that satisfies all the sentences of type (1): the last fact is immediate, recalling that truth of ground formulae (in expanded languages with names from the support sets) is preserved by substructures and extensions. After \(\omega \) steps we are done, because every tuple \(\underline{a}\in \mathcal M^{\prime }\) occurs after finitely many steps, and its corresponding \(\underline{b}\) in the construction is added at the immediately subsequent step. \(\square \)
To sum up, Theorem 3.2 states that, thanks to Formulae (1), the T-uniform interpolant (or cover) \(\psi \) of the formula \(\exists \underline{e}\, \phi (\underline{e}, \underline{y})\) is exactly the \(T^*\)-equivalent quantifier-free formula that eliminates the quantified variables \(\underline{e}\) from \(\exists \underline{e}\, \phi (\underline{e}, \underline{y})\): this means that computing covers in T is equivalent to eliminating quantifiers in its model completion \(T^*\).
4 Model-Checking Applications
In this section we supply old and new motivations for investigating covers and model completions in view of model-checking applications. We first report the considerations from [11, 13, 17, 37] on symbolic model checking via model completions (or, equivalently, via covers) in the basic case where system variables are represented as individual variables; for more advanced applications where system variables are both individual and higher-order variables, see [11, 13, 17]. Similar ideas (i.e., 'to use quantifier elimination in the model completion even if T does not allow quantifier elimination') were used in [59] for interpolation and symbol elimination.
Definition 4.1
A (quantifier-free) transition system is a tuple
\(\mathcal S=\langle \varSigma ,\, T,\, \underline{x},\, \iota (\underline{x}),\, \tau (\underline{x}, \underline{x}')\rangle \)
where: (i) \(\varSigma \) is a signature and T is a \(\varSigma \)theory; (ii) \(\underline{x}=x_1, \dots , x_n\) are individual variables; (iii) \(\iota (\underline{x})\) is a quantifierfree formula; (iv) \(\tau (\underline{x}, \underline{x}')\) is a quantifierfree formula (here the \(\underline{x}'\) are renamed copies of the \(\underline{x}\)). \(\lhd \)
A safety formula for a transition system \(\mathcal S\) is a further quantifier-free formula \(\upsilon (\underline{x})\) describing undesired states of \(\mathcal S\). We say that \(\mathcal S\) is safe with respect to \(\upsilon \) if the system has no finite run leading from \(\iota \) to \(\upsilon \), i.e. (formally) if there is no model \(\mathcal M\) of T and no \(k\ge 0\) such that the formula
\(\iota (\underline{x}^0)\wedge \tau (\underline{x}^0, \underline{x}^1)\wedge \cdots \wedge \tau (\underline{x}^{k-1}, \underline{x}^{k})\wedge \upsilon (\underline{x}^{k}) \qquad (2)\)
is satisfiable in \(\mathcal M\) (here the \(\underline{x}^i\)'s are renamed copies of \(\underline{x}\)). The safety problem for \(\mathcal S\) is the following: given \(\upsilon \), decide whether \(\mathcal S\) is safe with respect to \(\upsilon \).
Suppose now that the theory T mentioned in Definition 4.1 (i) is universal, has a decidable constraint satisfiability problem and admits a model completion \(T^*\). Algorithm 1 describes the backward reachability algorithm for handling the safety problem for \(\mathcal S\) (the dual algorithm working via forward search is described in equivalent terms in [37]). An integral part of the algorithm is the computation of preimages. For that purpose, for any \(\varphi _1(\underline{x},\underline{x}')\) and \(\varphi _2(\underline{x})\) (where \(\underline{x}'\) are renamed copies of \(\underline{x}\)), we define \( Pre (\varphi _1,\varphi _2)\) to be the formula \(\exists \underline{x}'(\varphi _1(\underline{x}, \underline{x}')\wedge \varphi _2(\underline{x}'))\).
The preimage of the set of states described by a state formula \(\phi (\underline{x})\) is the set of states described by \( Pre (\tau ,\phi )\). The subprocedure \({\mathsf {Q}}{\mathsf {E}}(T^*,\phi )\) in Line 6 applies the quantifier elimination algorithm of \(T^*\) to the existential formula \(\phi \). Without the application of this subprocedure, the existential prefix generated by the computation of preimages would grow in an unbounded way and some decidability results (see, e.g., the locally finite case mentioned below) would be compromised. Algorithm 1 computes iterated preimages of \(\upsilon \) (storing their disjunction into the variable B) and applies quantifier elimination to them, until a fixpoint is reached or until a set intersecting the initial states (i.e., satisfying \(\iota \)) is found. The inclusion (Line 2) and disjointness (Line 3) tests produce proof obligations that can be discharged because T has a decidable constraint satisfiability problem.
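To illustrate the control flow of backward reachability, the following is a minimal finite-state sketch (our own illustration, not the mcmt implementation): states are explicit values rather than formulae, so set operations play the role of the logical inclusion and disjointness tests, Pre is computed by inverting an explicit transition relation, and no quantifier elimination is needed in this finite setting.

```python
# Finite-state analogue of backward reachability: an illustrative sketch only.

def pre(tau, states):
    """Preimage: all states having some tau-successor inside `states`."""
    return {s for (s, s2) in tau if s2 in states}

def backward_search(init, tau, unsafe):
    """Return 'unsafe' if some run from `init` reaches `unsafe`, else 'safe'."""
    b = set(unsafe)            # states known to reach the unsafe region
    frontier = set(unsafe)
    while frontier:
        if b & init:           # disjointness test with the initial states
            return "unsafe"
        frontier = pre(tau, frontier) - b  # keep only genuinely new states
        b |= frontier          # inclusion test: fixpoint when nothing is new
    return "unsafe" if b & init else "safe"

# Toy system: 0 -> 1 -> 2 (2 loops), plus an unreachable bad state 3.
tau = {(0, 1), (1, 2), (2, 2)}
print(backward_search({0}, tau, {2}))  # -> unsafe (0 reaches 2)
print(backward_search({0}, tau, {3}))  # -> safe (3 is unreachable)
```

In the symbolic setting of Algorithm 1, the set `b` corresponds to the disjunction stored in the variable B, and the subtraction of already-known states corresponds to the fixpoint test of Line 2.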
The proof of Proposition 4.2 (which is a slight variant of a similar result for Simple Artifact Systems (SASs) in [17]) consists merely of the observation that the formulae (2) are quantifier-free, and that a quantifier-free formula is satisfiable in a model of T iff it is satisfiable in a model of \(T^*\): thus, if an unsafe trace exists at all, it arises in a model of \(T^*\), so that the subprocedure \({\mathsf {Q}}{\mathsf {E}}(T^*,\phi )\) in Line 6 does not introduce overapproximations and consequently no spurious trace can be produced during the search performed by our algorithm.
Proposition 4.2
Suppose that the universal \(\varSigma \)theory T has decidable constraint satisfiability problem and admits a model completion \(T^*\). For every transition system \(\mathcal S=\langle \varSigma ,T,\underline{x},\iota ,\tau \rangle \), the backward search algorithm is effective and partially correct for solving safety problems for \(\mathcal S\).^{Footnote 2}\(\lhd \)
Despite its simplicity, Proposition 4.2 is a crucial fact. Notice that it implies decidability of safety problems in some interesting cases: this happens, for instance, when in T there are only finitely many quantifier-free formulae in which \(\underline{x}\) occur, as is the case when T has a purely relational signature or, more generally, when T is locally finite^{Footnote 3}. Since a theory is universal iff it is closed under substructures [18], and since a universal locally finite theory has a model completion iff it has the amalgamation property [45, 64], it follows that Proposition 4.2 can be used to cover the decidability result stated in Theorem 5 of [8] (once restricted to transition systems over a first-order definable class of \(\varSigma \)-structures).
4.1 Database Schemas
In this subsection, we provide a new application for the model-checking techniques explained above [13, 17]. The application concerns the verification of integrated models of business processes and data [10], referred to as artifact systems [61], where the behavior of the process is influenced by data stored in a relational database (DB) with constraints. The data contained therein are read-only: they can be queried by the process and stored in a working memory, which in the context of this paper is constituted by a set of system variables. In this context, verifying safety amounts to checking that the system never reaches an undesired state, irrespective of what is contained in the read-only DB.
We define next the two key notions of (readonly) DB schema and instance, by relying on an algebraic, functional characterization.
Definition 4.3
A DB schema is a pair \(\langle \varSigma ,T\rangle \), where: (i) \(\varSigma \) is a DB signature, that is, a finite multisorted signature whose function symbols are all unary; (ii) T is a DB theory, that is, a set of universal \(\varSigma \)sentences. \(\lhd \)
Given a DB signature \(\varSigma \), we denote by \(\varSigma _{ srt }\) the set of sorts, by \(\varSigma _{ fun }\) the set of functions and by \(\varSigma _{ rel }\) the set of relations in \(\varSigma \). We associate to a DB signature \(\varSigma \) a characteristic (directed) graph \(G(\varSigma )\) capturing the dependencies induced by functions over sorts. Specifically, \(G(\varSigma )\) is an edge-labeled graph whose set of nodes is \(\varSigma _{ srt }\), with a labeled edge \(S \xrightarrow {f} S'\) for each \(f:S\longrightarrow S'\) in \(\varSigma _{ fun }\). We say that \(\varSigma \) is acyclic if \(G(\varSigma )\) is so. The leaves of \(\varSigma \) are the nodes of \(G(\varSigma )\) without outgoing edges. These terminal sorts are divided into two subsets, respectively representing unary relations and value sorts. Non-value sorts (i.e., unary relations and non-leaf sorts) are called id sorts, and are conceptually used to represent (identifiers of) different kinds of objects. Value sorts, instead, represent datatypes such as strings, numbers, clock values, etc. We denote the set of id sorts in \(\varSigma \) by \(\varSigma _{ ids }\), and that of value sorts by \(\varSigma _{ val }\); hence \(\varSigma _{ srt } = \varSigma _{ ids }\uplus \varSigma _{ val }\).
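To illustrate, the characteristic graph and the acyclicity test can be sketched in Python as follows, assuming a DB signature is encoded as a map sending each function symbol f : S → S' to the pair (S, S'); all identifiers in the sketch are of our choosing:

```python
def characteristic_graph(sigma_fun):
    """Edge-labeled graph over sorts: an edge S --f--> S' for each f : S -> S'."""
    edges = {}
    for f, (src, tgt) in sigma_fun.items():
        edges.setdefault(src, []).append((f, tgt))
    return edges

def is_acyclic(edges):
    """Standard depth-first cycle detection on G(Sigma)."""
    WHITE, GREY, BLACK = 0, 1, 2
    color = {}
    def visit(node):
        color[node] = GREY
        for _, nxt in edges.get(node, []):
            c = color.get(nxt, WHITE)
            if c == GREY or (c == WHITE and not visit(nxt)):
                return False        # a GREY successor closes a cycle
        color[node] = BLACK
        return True
    return all(visit(n) for n in list(edges) if color.get(n, WHITE) == WHITE)

def leaves(edges, sorts):
    """The leaves of Sigma: sorts without outgoing edges in G(Sigma)."""
    return {S for S in sorts if not edges.get(S)}
```

On an encoding of the signature \(\varSigma _{ hr }\) of Example 4.1 below (whose concrete function names we choose freely), is_acyclic returns True and the only leaf is the string sort.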
Before giving the formal definition of DB instance, we show an interesting example of DB signature inspired by concrete business processes.
Example 4.1
([17]) The human resource (HR) branch of a company stores the following information inside a relational database: (i) users registered to the company website, who are potentially interested in job positions offered by the company; (ii) the different, available job categories; (iii) employees belonging to HR, together with the job categories they are competent in (in turn indicating which job applicants they could interview).
To formalize these different aspects, we make use of a DB signature \(\varSigma _{ hr }\) consisting of: (i) four id sorts, used to respectively identify users, employees, job categories, and the competence relationship connecting employees to job categories; (ii) one value sort containing strings used to name users and employees, and describe job categories. In addition, \(\varSigma _{ hr }\) contains five function symbols mapping: (i) user identifiers to their corresponding names; (ii) employee identifiers to their corresponding names; (iii) job category identifiers to their corresponding descriptions; (iv) competence identifiers to their corresponding employees and job categories.
The characteristic graph of \(\varSigma _{ hr }\) is shown in Fig. 1 (left part). \(\lhd \)
We now focus on extensional data conforming to a given DB schema.
Definition 4.4
A DB instance of DB schema \(\langle \varSigma ,T\rangle \) is a \(\varSigma \)structure \(\mathcal M\) such that \(\mathcal M\) is a model of T.^{Footnote 4}\(\lhd \)
We respectively denote by \(S^\mathcal M\), \(f^\mathcal M\), and \(c^\mathcal M\) the interpretation in \(\mathcal M\) of the sort S (this is a set), of the function symbol f (this is a settheoretic function), and of the constant c (this is an element of the interpretation of the corresponding sort). Obviously, \(f^\mathcal M\) and \(c^\mathcal M\) must match the sorts declared in \(\varSigma \). For instance, if the source and the target of f are, respectively, S and U, then the function \(f^\mathcal M\) has domain \(S^\mathcal M\) and range \(U^\mathcal M\).
One might be surprised by the fact that signatures in our DB schemas contain unary function symbols, besides relational symbols. As shown in [11, 13, 17], the algebraic, functional characterization of DB schemas and instances can actually be reinterpreted in the classical, relational model so as to reconstruct the requirements posed in [44]. In the latter work, the schema of the read-only database must satisfy the following conditions: (i) each relation schema has a single-attribute primary key; (ii) attributes are typed; (iii) attributes may be foreign keys referencing other relation schemas; (iv) the primary keys of different relation schemas are pairwise disjoint.
We now discuss why these requirements are matched by DB schemas.
Definition 4.3 naturally corresponds to the definition of relational database schema with singleattribute primary and foreign keys. To see this, we adopt the named perspective, where each relation schema is defined by a signature containing a relation name and a set of typed attribute names. Let \(\langle \varSigma ,T\rangle \) be a DB schema. Each sort S from \(\varSigma \) corresponds to a dedicated relation \(R_S\) with the following attributes: (i) one identifier attribute \(id_S\) with type S; (ii) one dedicated attribute \(a_f\) with type \(S'\) for every function symbol f from \(\varSigma \) of the form \(f: S \longrightarrow S'\).
The fact that \(R_S\) is constructed starting from functions in \(\varSigma \) naturally induces corresponding functional dependencies within \(R_S\), and inclusion dependencies from \(R_S\) to other relation schemas. In particular, for each nonid attribute \(a_f\) of \(R_S\), we get a functional dependency from \(id_S\) to \(a_f\). Altogether, such dependencies witness that \( id _S\) is the primary key of \(R_S\). In addition, for each nonid attribute \(a_f\) of \(R_S\) whose corresponding function symbol f has id sort \(S'\) as image, we get an inclusion dependency from \(a_f\) to the id attribute \(id_{S'}\) of \(R_{S'}\). This captures that \(a_f\) is a foreign key referencing \(R_{S'}\). This view is shown in the following example.
Example 4.2
The diagram on the right in Fig. 1 graphically depicts the relational view corresponding to the DB signature of Example 4.1. \(\lhd \)
Given a DB instance \(\mathcal {M}\) of \(\langle \varSigma ,T\rangle \), its corresponding relational instance \({\mathcal {R}}[{\mathcal M}]\) is the minimal set satisfying the following property: for every id sort S from \(\varSigma \), let \(f_1,\ldots ,f_n\) be all the functions in \(\varSigma \) with domain S; then, for every identifier \(\texttt {o} \in S^\mathcal {M}\), \({\mathcal {R}}[{\mathcal M}]\) contains a labeled fact of the form \(R_S(id_S\,{:}\,\texttt {o},a_{f_1}\,{:}\,f_1^\mathcal {M}(\texttt {o}),\ldots ,a_{f_n}\,{:}\,f_n^\mathcal {M}(\texttt {o}))\), where \(attr\,{:}\,\texttt {c}\) means that the element \(\texttt {c}\) corresponds to the attribute attr of the relation \(R_S\). In addition, \({\mathcal {R}}[{\mathcal M}]\) contains the tuples from \(r^\mathcal M\), for every relational symbol r from \(\varSigma \) (these relational symbols represent plain relations, i.e. those not possessing a key).
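The construction of \({\mathcal {R}}[{\mathcal M}]\) is easy to mechanize. The following sketch (with names of our choosing) encodes a DB instance by finite sets interpreting the id sorts and by dictionaries interpreting the function symbols, and produces the labeled facts of each relation \(R_S\) (plain relations are omitted for brevity):

```python
def relational_view(id_sorts, funs):
    """id_sorts maps each id sort S to the finite set interpreting it;
    funs maps each function symbol f : S -> S' to a pair (S, d), where the
    dictionary d encodes the interpretation of f.
    Returns, for each id sort S, the labeled facts of the relation R_S."""
    view = {}
    for S, elems in id_sorts.items():
        # the functions f_1, ..., f_n of Sigma whose domain sort is S
        fs = sorted(f for f, (dom, _) in funs.items() if dom == S)
        view['R_' + S] = {
            # one labeled fact R_S(id_S: o, a_f1: f1(o), ..., a_fn: fn(o))
            tuple([('id_' + S, o)] + [('a_' + f, funs[f][1][o]) for f in fs])
            for o in elems
        }
    return view
```

For example, an id sort UserId with two identifiers and one function userName yields a relation R_UserId with attributes id_UserId and a_userName, one labeled fact per identifier.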
We close our discussion by focusing on DB theories. Notice that \(\mathcal {EUF}\) suffices to handle the sophisticated setting of artifact systems from [13, 17] (e.g., key dependencies). The role of a non-empty DB theory is to encode background axioms expressing additional constraints. We illustrate a typical background axiom, required to handle the possible presence of undefined identifiers/values in the different sorts. This, in turn, is essential to capture artifact systems whose working memory is initially undefined, in the style of [21, 44]. To accommodate this, we add to every sort S of \(\varSigma \) a constant \(\texttt {undef}_S\) (written by abuse of notation just \(\texttt {undef}\) from now on), used to specify an undefined value. Then, for each function symbol f of \(\varSigma \), we can impose additional constraints involving \(\texttt {undef}\), for example by adding the following axiom to the DB theory:
\(\forall x\; (x = \texttt {undef} \leftrightarrow f(x) = \texttt {undef})\)    (3)
This axiom states that the application of f to the undefined value produces the undefined value, and that this is the only situation in which f returns an undefined value.
A slightly different approach may handle many undefined values for each sort; the reader is referred to [11, 13, 17] for examples of concrete database instances formalized in our framework. We just point out that in most cases the axioms needed for our DB theories T are one-variable universal axioms (like Axiom (3)), so that they fit the hypotheses of Proposition 4.5 below.
We are interested in applying the algorithm of Proposition 4.2 to a (nondeterministic) version of Simple Artifact Systems (SASs) [17], i.e. transition systems \(\mathcal S~= ~\langle \varSigma , T, \underline{x}, \iota (\underline{x}), \tau (\underline{x}, \underline{x}')\rangle \), where \(\langle \varSigma ,T\rangle \) is a DB schema in the sense of Definition 4.3. To this aim, it is sufficient to identify a suitable class of DB theories having a model completion and whose constraint satisfiability problem is decidable. A first result in this sense is given below. Given the characteristic graph \(G(\varSigma )\) of a DB signature \(\varSigma \), we recall that \(\varSigma \) is said to be acyclic if \(G(\varSigma )\) is so.
Proposition 4.5
[17] A DB theory T has decidable constraint satisfiability problem and admits a model completion in case it is axiomatized by finitely many universal onevariable formulae and \(\varSigma \) is acyclic. \(\lhd \)
We omit the proof of the above proposition, because the proposition does not play a role in what follows: the proof can be easily obtained from well-known facts in the literature and is nevertheless reported in full detail in [17]. We only report here the algorithm for quantifier elimination in \(T^*\) suggested by that proof: given a primitive formula \(\exists \underline{e}\,\phi (\underline{e},\underline{y})\), the output \(\psi (\underline{y})\) of the algorithm is simply the conjunction of the set of all quantifier-free formulae \(\chi (\underline{y})\) such that \(\phi (\underline{e},\underline{y})\rightarrow \chi (\underline{y})\) is a logical consequence of T (up to T-equivalence, there are only finitely many such formulae, because \(\varSigma \) is acyclic). We also notice that, since acyclicity of \(\varSigma \) yields local finiteness, we immediately get as a corollary the decidability of safety problems for transition systems based on DB schemas satisfying the hypotheses of the above proposition.
5 Covers via Constrained Superposition
Of course, a model completion may not exist at all. Proposition 4.5 shows that it exists in case T is a DB theory axiomatized by universal one-variable formulae and \(\varSigma \) is acyclic. The second hypothesis is unnecessarily restrictive, and the algorithm for quantifier elimination suggested by the proof of Proposition 4.5 is highly impractical: for these reasons we pursue a different approach. In this section, we drop the acyclicity hypothesis and examine the case where the theory T is empty and the signature \(\varSigma \) may contain function symbols of any arity. Covers in this context were shown to exist already in [37], using an algorithm that, roughly speaking, determines all the conditional equations that can be derived concerning the nodes of the congruence closure graph. An algorithm for the generation of interpolants, still relying on congruence closure [40], is sketched in [41].
We follow a different plan: we want to produce covers (and show that they exist) using saturation-based theorem proving. The natural idea in this direction is to take the matrix \(\phi (\underline{e},\underline{y})\) of the primitive formula \(\exists \underline{e}\, \phi (\underline{e}, \underline{y})\) whose cover we want to compute: this is a conjunction of literals, so we consider each variable as a free constant, saturate the corresponding set of ground literals and finally output the literals involving only the \(\underline{y}\). For saturation, one can use any version of the superposition calculus [54]. However, this procedure is not sufficient for our problem. As a trivial counterexample, consider the primitive formula \(\exists e \,(R(e, y_1)\wedge \lnot R(e, y_2))\): the set of literals \(\{ R(e, y_1), \lnot R(e, y_2)\}\) is saturated (recall that we view \(e,y_1,y_2\) as constants), yet the formula has a non-trivial cover \(y_1\ne y_2\) which is not produced by saturation. If we move to signatures with function symbols, the situation is even worse: the set of literals \(\{ f(e,y_1)=y'_1, f(e,y_2)= y'_2\}\) is saturated, but the formula \(\exists e\, (f(e,y_1)=y'_1\wedge f(e,y_2)= y'_2)\) has the conditional equality \(y_1=y_2\rightarrow y'_1=y'_2\) as cover. Disjunctions of disequations may also arise: the cover of \(\exists e\, h(e,y_1,y_2)\ne h(e, y'_1,y'_2)\) (as well as the cover of \(\exists e\, f(f(e,y_1),y_2)\ne f(f(e,y_1'),y_2')\), see Example 5.5 below) is \(y_1\ne y'_1\vee y_2\ne y'_2\). ^{Footnote 5}
Notice that our problem is different from the problem of producing ordinary quantifierfree interpolants via saturationbased theorem proving [42]: for ordinary Craig interpolants, we have as input two quantifierfree formulae \(\phi (\underline{e}, \underline{y}), \phi '(\underline{y}, \underline{z})\) such that \(\phi (\underline{e},\underline{y})\rightarrow \phi '(\underline{y}, \underline{z})\) is valid; here we have a single formula \(\phi (\underline{e}, \underline{y})\) as input and we are asked to find an interpolant which is good for all possible \(\phi '(\underline{y}, \underline{z})\) such that \(\phi (\underline{e},\underline{y})\rightarrow \phi '(\underline{y}, \underline{z})\) is valid. Ordinary interpolants can be extracted from a refutation of \(\phi (\underline{e},\underline{y})\wedge \lnot \phi '(\underline{y}, \underline{z})\), whereas here we are not given any refutation at all (and we are not even supposed to find one).
What we are going to show is that saturation via superposition can nevertheless be used to produce covers, if suitably adjusted. In this section we consider signatures with n-ary function symbols (for all \(n\ge 1\)). For simplicity, we omit n-ary relation symbols (they can easily be handled by rewriting \(R(t_1, \dots , t_n)\) as \(R(t_1, \dots , t_n)=true\), as is customary in the paramodulation literature [54]).
We are going to compute the cover of a primitive formula \(\exists \underline{e}\,\phi (\underline{e}, \underline{y})\), fixed for the remainder of this section. We call the variables \(\underline{e}\) existential and the variables \(\underline{y}\) parameters. By applying abstraction steps, we can assume that \(\phi \) is primitive flat, i.e. that it is a conjunction of \(\underline{e}\)-flat literals, defined below. [By an abstraction step we mean replacing \(\exists \underline{e}\,\phi \) with \(\exists \underline{e}\, \exists e' \,(e'= u\wedge \phi ')\), where \(e'\) is a fresh variable and \(\phi '\) is obtained from \(\phi \) by replacing some occurrences of a term \(u(\underline{e}, \underline{y})\) by \(e'\).]
A term or a formula is said to be \(\underline{e}\)-free iff the existential variables do not occur in it. An \(\underline{e}\)-flat term is an \(\underline{e}\)-free term \(t(\underline{y})\), or a variable from \(\underline{e}\), or a term of the kind \(f(u_1, \dots , u_n)\), where f is a function symbol and \(u_1, \dots , u_n\) are \(\underline{e}\)-free terms or variables from \(\underline{e}\). An \(\underline{e}\)-flat literal is a literal of the form
\(t=a, \qquad a\ne b\)
where t is an \(\underline{e}\)-flat term and a, b are either \(\underline{e}\)-free terms or variables from \(\underline{e}\).
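The abstraction steps can be mechanized as in the following sketch, assuming terms are represented as nested tuples ('f', arg1, ..., argk) with variables and constants as strings; fresh existential variables and their defining equations are accumulated in defs (all names are ours):

```python
def e_free(t, evars):
    """True iff no existential variable occurs in the term t."""
    if isinstance(t, tuple):
        return all(e_free(s, evars) for s in t[1:])
    return t not in evars

def flatten(t, evars, defs):
    """Turn t into an e-flat term: every nested argument containing an
    existential variable is abstracted by a fresh variable e', together
    with a definition e' = u appended to defs."""
    if not isinstance(t, tuple) or e_free(t, evars):
        return t                   # variables and e-free terms stay as-is
    args = []
    for u in t[1:]:
        if isinstance(u, tuple) and not e_free(u, evars):
            u2 = flatten(u, evars, defs)   # recursively flatten the argument
            fresh = "e'%d" % len(defs)     # fresh existential variable
            evars.add(fresh)
            defs.append((fresh, u2))       # record the definition e' = u2
            args.append(fresh)
        else:
            args.append(u)
    return (t[0],) + tuple(args)
```

For instance, flattening f(g(e, y1), y2) with e existential yields f(e'0, y2) together with the definition e'0 = g(e, y1).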
We assume the reader is familiar with standard conventions used in rewriting and paramodulation literature: in particular \(s_{\vert p}\) denotes the subterm of s in position p and \(s[u]_p\) denotes the term obtained from s by replacing \(s_{\vert p}\) with u. We use \(\equiv \) to indicate coincidence of syntactic expressions (as strings) to avoid confusion with equality symbol; when we write equalities like \(s=t\) below, we may mean both \(s=t\) or \(t=s\) (an equality is seen as a multiset of two terms). For information on reduction orderings, see for instance [2].
We first replace the variables \(\underline{e}=e_1,\dots ,e_{n}\) and \(\underline{y}= y_1, \dots , y_{m}\) by free constants (we keep the names \(e_1,\dots ,e_{n}, y_1, \dots , y_{m}\) for these constants). Let > be a reduction ordering, total on ground terms, such that \(\underline{e}\)-flat literals \(t=a\) are always oriented from left to right in the following two cases: (i) t is not \(\underline{e}\)-free and a is \(\underline{e}\)-free; (ii) t is not \(\underline{e}\)-free, it is not equal to any of the \(\underline{e}\), and a is a variable from \(\underline{e}\). To obtain such properties, one may for instance choose a suitable Knuth-Bendix ordering taking weights in some transfinite ordinal (see, e.g., [46]).
Given two \(\underline{e}\)flat terms t, u, we indicate with E(t, u) the following procedure, which intuitively is a unification algorithm for the terms t and u where the \(\underline{e}\) variables are treated as constants; as shown by Lemma 5.1 below, E(t, u) collects ‘the equalities that are needed in order to force \(t=u\)’, whenever the \(\underline{e}\) are assumed to be free (i.e. not to satisfy any specific equational constraint):

- E(t, u) fails if t is \(\underline{e}\)-free and u is not \(\underline{e}\)-free (or vice versa);
- E(t, u) fails if \(t\equiv e_i\) and (either \(u\equiv f(t_1, \dots , t_k)\) or \(u\equiv e_j\) for \(i\ne j\));
- \(E(t,u)=\emptyset \) if \(t\equiv u\);
- \(E(t,u)=\{t=u\}\) if t and u are different but both \(\underline{e}\)-free;
- E(t, u) fails if none of t, u is \(\underline{e}\)-free, \(t\equiv f(t_1,\dots , t_k)\) and \(u\equiv g(u_1,\dots , u_l)\) for \(f\not \equiv g\);
- \(E(t,u)=E(t_1,u_1)\cup \cdots \cup E(t_k,u_k)\) if none of t, u is \(\underline{e}\)-free, \(t\equiv f(t_1,\dots , t_k)\), \(u\equiv f(u_1,\dots , u_k)\) and none of the \(E(t_i, u_i)\) fails.
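These clauses translate directly into code. The following sketch (names ours) assumes terms are represented as nested tuples ('f', arg1, ..., argk) with variables and constants as strings, and returns each collected equality t = u as a pair (orientation being immaterial here):

```python
class Fail(Exception):
    """Raised when the E(t, u) subprocedure fails."""

def e_free(t, evars):
    """True iff no existential variable occurs in the term t."""
    if isinstance(t, tuple):
        return all(e_free(s, evars) for s in t[1:])
    return t not in evars

def E(t, u, evars):
    """Collect the e-free equalities needed to force t = u,
    treating the variables in evars as free constants."""
    if t == u:
        return set()                 # E(t, u) is empty if t and u coincide
    tf, uf = e_free(t, evars), e_free(u, evars)
    if tf and uf:
        return {(t, u)}              # both e-free and different
    if tf != uf:
        raise Fail                   # exactly one side is e-free
    # neither side is e-free:
    if t in evars or u in evars:
        raise Fail                   # an e-variable against a different term
    if t[0] != u[0] or len(t) != len(u):
        raise Fail                   # clash of head function symbols
    out = set()
    for ti, ui in zip(t[1:], u[1:]):
        out |= E(ti, ui, evars)      # recurse on corresponding arguments
    return out
```

For instance, E(f(e, y1), f(e, y2)) returns the single equality y1 = y2, i.e. exactly the condition needed to force the two terms to coincide when e is free, matching the conditional-equality example at the beginning of this section.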
Notice that, whenever E(t, u) succeeds, the formula \(\bigwedge E(t,u)\rightarrow t=u\) is universally valid. The definition of E(t, u) is motivated by the next lemma.
Lemma 5.1
Let R be a convergent (i.e. terminating and confluent) ground rewriting system whose rules consist of \(\underline{e}\)-free terms. Suppose that t and u are \(\underline{e}\)-flat terms with the same R-normal form. Then E(t, u) does not fail, and the two members of every pair from E(t, u) have the same R-normal form as well. \(\lhd \)
Proof
Since the rules from R are \(\underline{e}\)-free, no R-rewriting is possible at the root position of a term that is not \(\underline{e}\)-free. Hence, whenever t and u are not \(\underline{e}\)-free and have the same R-normal form, their head symbols must coincide and their corresponding arguments must pairwise have the same R-normal forms; an easy induction then shows that E(t, u) does not fail and collects only pairs with the same R-normal form. \(\square \)
In the following, we handle constrained ground flat literals of the form \(L\,\Vert \, C\) where L is a ground flat literal and C is a conjunction of ground equalities among \(\underline{e}\)free terms. The logical meaning of \(L\,\Vert \, C\) is the Horn clause \(\bigwedge C\rightarrow L\).
In the literature, various calculi with constrained clauses have been considered, starting, e.g., from the non-ground constrained versions of the Superposition Calculus of [4, 53]. The calculus we propose here is inspired by such versions and has close similarities with a subcase of the hierarchic superposition calculus [5], or rather with its “weak abstraction” variant from [6] (we thank an anonymous referee of our CADE 2019 submission for pointing out this connection).
The rules of our Constrained Superposition Calculus follow; each rule applies provided the E subprocedure called by it does not fail. The symbol \(\bot \) indicates the empty clause. Further explanations and restrictions to the calculus are given in the Remarks below.
Superposition Right: from \(l=r\,\Vert \, C\) and \(s=t\,\Vert \, D\) (with \(l>r\) and \(s>t\)), infer \(s[r]_p = t\,\Vert \, C\cup D\cup E(s_{\vert p}, l)\);
Superposition Left: from \(l=r\,\Vert \, C\) and \(s\ne t\,\Vert \, D\) (with \(l>r\) and \(s>t\)), infer \(s[r]_p \ne t\,\Vert \, C\cup D\cup E(s_{\vert p}, l)\);
Reflexion: from \(t\ne u\,\Vert \, C\), infer \(\bot \,\Vert \, C\cup E(t,u)\);
Demodulation: from \(L\,\Vert \, C\) and \(l=r\,\Vert \, D\) (with \(l>r\), \(L_{\vert p}\equiv l\) and \(D\subseteq C\)), replace the former by \(L[r]_p\,\Vert \, C\).
Remark 5.1
The first three rules are inference rules: they are nondeterministically selected for application, until no rule applies anymore. The selection strategy for the rule to be applied is not relevant for the correctness and completeness of the algorithm (some variant of a ‘given clause algorithm’ can be applied). An inference rule is not applied in case one premise is \(\underline{e}\)free (we have no reason to apply inferences to \(\underline{e}\)free premises, since we are not looking for a refutation). \(\lhd \)
Remark 5.2
The Demodulation rule is a simplification rule: its application not only adds the conclusion to the current set of constrained literals, but it also removes the first premise. It is easy to see (e.g., representing literals as multisets of terms and extending the total reduction ordering to multisets), that one cannot have an infinite sequence of consecutive applications of Demodulation rules. \(\lhd \)
Remark 5.3
The calculus takes \(\{L\Vert \emptyset ~\mid ~ L\) is a flat literal from the matrix of \(\phi \}\) as the initial set of constrained literals. It terminates when a saturated set of constrained literals is reached. We say that S is saturated iff every constrained literal that can be produced by an inference rule, after being exhaustively simplified via Demodulation, is already in S (there are more sophisticated notions of ‘saturation up to redundancy’ in the literature, but we do not need them). When it reaches a saturated set S, the algorithm outputs the conjunction of the clauses \(\bigwedge C\rightarrow L\), varying \(L\,\Vert \, C\) among the \(\underline{e}\)free constrained literals from S. \(\lhd \)
We need some rule application policy to ensure termination: without any such policy, a set like
\(\{\; e=y \,\Vert \, \emptyset ,\;\; f(e)=e \,\Vert \, \emptyset \;\}\)    (4)
may produce by Right Superposition the infinitely many literals (all oriented from right to left) \(f(y)=e\,\Vert \, \emptyset \), \(f(f(y))=e\,\Vert \, \emptyset \), \(f(f(f(y)))=e\,\Vert \, \emptyset \), etc.
The next remark explains the policy we follow.
Remark 5.4
[Policy Remark] We apply Demodulation only in case the second premise is of the kind \(e_j=t(\underline{y})\, \Vert \, D\), where t is \(\underline{e}\)-free.
The Demodulation rule is applied with higher priority than the inference rules.^{Footnote 6} Among all possible applications of the Demodulation rule, we give priority to those where both premises have the form \(e_j=t(\underline{y})\, \Vert \, D\) (for the same \(e_j\), but with possibly different D's, the D from the second premise being included in the D of the first). In case our current set of constrained literals contains two constrained literals of the kind \(e_j=t_1(\underline{y})\, \Vert \, D\) and \(e_j=t_2(\underline{y})\, \Vert \, D\) (notice that the \(e_j\)'s and the D's here are the same), among the two possible applications of the Demodulation rule, we apply the one that keeps the smaller \(t_i\). Notice that in this way two different constrained literals cannot simplify each other. \(\lhd \)
We say that a constrained literal \(L\, \Vert C\) belonging to a set of constrained literals S is simplifiable in S iff it is possible to apply (according to the above policy) a Demodulation rule removing it. A first effect of our policy is:
Lemma 5.2
If a constrained literal \(L\,\Vert \, C\) is simplifiable in S, then after applying to S any sequence of rules, it remains simplifiable until it gets removed. After being removed, if it is regenerated, it is still simplifiable and so it is eventually removed again. \(\lhd \)
Proof
Suppose that \(L\,\Vert \, C\) can be simplified by \(e=t\,\Vert \, D\), and suppose that a rule is applied to the current set of constrained literals. Since there are simplifiable constrained literals, that rule cannot be an inference rule, by the priority stated in Remark 5.4. For simplification rules, keep Remark 5.4 in mind again. If \(L\,\Vert \, C\) is simplified, it is removed; if neither \(L\,\Vert \, C\) nor \(e=t\,\Vert \, D\) gets simplified, the situation does not change; if \(e=t\,\Vert \, D\) gets simplified, this can only be done by some \(e=t'\Vert \,D'\), but then \(L\,\Vert \, C\) is still simplifiable (although in a different way) using \(e=t'\Vert \,D'\) (we have that \(D'\) is included in D, which is in turn included in C). Similar observations apply if \(L\,\Vert \, C\) is removed and regenerated. \(\square \)
Due to Lemma 5.2, if we show that a derivation (i.e., a sequence of applications of rules) can produce terms only from a finite set, it is clear that when no new constrained literal is produced, saturation is reached. First notice that:
Lemma 5.3
Every constrained literal \(L\,\Vert \, C\) produced during the run of the algorithm is \(\underline{e}\)-flat.
\(\lhd \)
Proof
The constrained literals from the initialization are \(\underline{e}\)-flat. The Demodulation rule, applied according to Remark 5.4, produces an \(\underline{e}\)-flat literal out of an \(\underline{e}\)-flat literal. The same happens for the Superposition rules: in fact, since both the terms s and l from these rules are \(\underline{e}\)-flat, a Superposition may take place at root position or may rewrite some \(l\equiv e_j\) with \(r\equiv e_i\) or with \(r\equiv t(\underline{y})\).^{Footnote 7}\(\square \)
There are in principle infinitely many \(\underline{e}\)-flat terms that can be generated out of the \(\underline{e}\)-flat terms occurring in \(\phi \) (see counterexample (4) above). We show, however, that only finitely many \(\underline{e}\)-flat terms can in fact occur during saturation, and that one can determine in advance the finite set they are taken from.
To formalize this idea, let us introduce a hierarchy of \(\underline{e}\)-flat terms (this hierarchy concerns terms, not clauses or constraints, although it will be used to delimit the kind of clauses or constraints that might occur in a saturation process). Let \(D_0\) be the set of \(\underline{e}\)-flat terms occurring in \(\phi \), and let \(D_{k+1}\) be the set of \(\underline{e}\)-flat terms obtained by simultaneous rewriting of an \(\underline{e}\)-flat term from \(\bigcup _{i\le k} D_i\) via rewriting rules of the kind \(e_j\rightarrow t_j(\underline{y})\), where the \(t_j\) are \(\underline{e}\)-free terms from \(\bigcup _{i\le k} D_i\). The degree of an \(\underline{e}\)-flat term is the minimum k such that it belongs to the set \(D_k\) (it is necessary to take the minimum because the same term can be obtained at different stages and via different rewritings).
Lemma 5.4
Let the \(\underline{e}\)-flat term \(t'\) be obtained by a rewriting \(e_j\rightarrow u(\underline{y})\) from the \(\underline{e}\)-flat term t; then, if t has degree \(k>1\) and u has degree at most \(k-1\), we have that \(t'\) has degree at most k. \(\lhd \)
Proof
This is clear, because at stage k one can directly produce \(t'\) instead of just t: in fact, all the rewritings directly producing \(t'\) replace an occurrence of some \(e_i\) by an \(\underline{e}\)-free term, so they are all performed at parallel positions.
[We illustrate the phenomenon via an example: suppose that t is \(f(e_1, g(g(c)))\) and that \(t'\) is obtained from t by rewriting \(e_1\) to g(c). Now it might well be that t has degree 2, being obtained from \(f(e_1,e_2)\) via \(e_2\mapsto g(g(c))\) (the latter having been previously obtained from \(g(e_3)\) via \(e_3\mapsto g(c)\)). Now \(t'\) still has degree 2, because it can be directly obtained from \(f(e_1,e_2)\) via the parallel rewritings \(e_1\mapsto g(c)\), \(e_2\mapsto g(g(c))\).] \(\square \)
Proposition 5.5
The saturation of the initial set of \(\underline{e}\)-flat constrained literals always terminates after finitely many steps. \(\lhd \)
Proof
We show that all \(\underline{e}\)-flat terms that may occur during saturation have degree at most n (where n is the cardinality of \(\underline{e}\)). This shows that the saturation must terminate, because only finitely many terms may occur in a derivation (see the above observations).
Let the algorithm during saturation reach the state S; we say that a constraint C allows the explicit definition of \(e_j\) in S iff S contains a constrained literal of the kind \(e_j=t(\underline{y})\, \Vert D\) with \(D\subseteq C\). Now we show by mutual induction two facts concerning a constrained literal \(L\,\Vert \, C\in S\):

(1) if an \(\underline{e}\)-flat term u of degree k occurs in L, then C allows the explicit definition of k different \(e_j\) in S;
(2) if L is of the kind \(e_i=t(\underline{y})\), for an \(\underline{e}\)-free term t of degree k, then either \(e_i=t\,\Vert \, C\) can be simplified in S, or C allows the explicit definition of \(k+1\) different \(e_j\) in S (\(e_i\) itself is of course included among these \(e_j\)).
Notice that (1) is sufficient to exclude that any \(\underline{e}\)flat term of degree bigger than n can occur in a constrained literal arising during the saturation process.
We prove (1) and (2) by induction on the length of the derivation leading to \(L\,\Vert \, C\in S\). Notice that it is sufficient to check that (1) and (2) hold the first time \(L\,\Vert \, C\) appears in S, because if C allows the explicit definition of a certain variable in S, it will continue to do so in any \(S'\) obtained from S by continuing the derivation (the definition may be changed by the Demodulation rule, but the fact that \(e_i\) is explicitly defined persists). Also, by Lemma 5.2, a simplifiable literal cannot become non-simplifiable.
(1) and (2) are evident if S is the initial state. To show (1), suppose that u occurs for the first time in \(L\,\Vert \,C\) as the effect of the application of a certain rule: we can freely assume that u does not occur in the literals from the premises of the rule (otherwise induction trivially applies) and that u, of degree k, is obtained by rewriting, in a non-root position, some \(u'\) occurring in a constrained literal \(L'\,\Vert \, D'\) via some \(e_j\rightarrow t\, \Vert \, D\). This might be the effect of a Demodulation or of a Superposition in a non-root position (Superpositions at root position do not produce new terms).
If \(u'\) has degree k, then by induction \(D'\) contains the required k explicit definitions, and we are done because \(D'\) is included in C. If \(u'\) has lower degree, then t must have degree at least \(k1\) (otherwise u does not reach degree k by Lemma 5.4). Then by induction on (2), the constraint D (also included in C) has \((k1)+1=k\) explicit definitions (when a constraint \(e_j\rightarrow t\, \Vert D\) is selected for Superposition or for making Demodulations in a nonroot position, it is itself not simplifiable according to the procedure explained in Remark 5.4).
To show (2), we analyze the reasons why the non-simplifiable constrained literal \(e_i=t(\underline{y})\, \Vert \, C\) is produced (let k be the degree of t).
Suppose it is produced from \(e_i=u'\,\Vert \, C\) via Demodulation with \(e_j= u(\underline{y})\, \Vert \, D\) (with \(D\subseteq C\)) in a non-root position; if \(u'\) has degree at least k, we apply the induction hypothesis for (1) to \(e_i=u'\,\Vert \, C\): by this induction hypothesis, we get k explicit definitions in C, and we can add to them the further explicit definition \(e_i=t(\underline{y})\) (the explicit definitions from C cannot concern \(e_i\), because \(e_i=t(\underline{y})\, \Vert \, C\) is not simplifiable). Otherwise, \(u'\) has degree less than k and u has degree at least \(k-1\) by Lemma 5.4 (recall that t has degree k): by induction, \(e_j= u\, \Vert \, D\) is not simplifiable (it is used as the active part of a Demodulation in a non-root position, see Remark 5.4) and supplies k explicit definitions, inherited by \(C\supseteq D\). Note that \(e_i\) cannot have a definition in D, otherwise \(e_i=t(\underline{y})\,\Vert \, C\) would be simplifiable, so with \(e_i=t(\underline{y})\,\Vert \, C\) we get the required \(k+1\) definitions.
The remaining case is when \(e_i=t(\underline{y})\, \Vert \, C\) is produced via Superposition Right. Such a Superposition might be at a root or at a non-root position. We first analyze the case of a root position. This might be via \(e_j=e_i\,\Vert \, C_1\) and \(e_j=t(\underline{y})\, \Vert \, C_2\) (with \(e_j>e_i\) and \(C=C_1\cup C_2\), because \(E(e_j,e_j)=\emptyset \)), but in such a case one can easily apply induction. Otherwise, we have a different kind of Superposition at root position: \(e_i=t(\underline{y})\, \Vert \, C\) is obtained from \(s=e_i\,\Vert \, C_1\) and \(s'=t(\underline{y})\, \Vert \, C_2\), with \(C=C_1\cup C_2\cup E(s,s')\). In this case, by induction for (1), \(C_2\) supplies k explicit definitions, to be inherited by C. Among such definitions, there cannot be an explicit definition of \(e_i\), otherwise \(e_i=t(\underline{y})\, \Vert \, C\) would be simplifiable, so again we get the required \(k+1\) definitions.
In the case of a Superposition at a non-root position, we have that \(e_i=t(\underline{y})\, \Vert \, C\) is obtained from \(u'=e_i\,\Vert \, C_1\) and \(e_j=u(\underline{y})\, \Vert \, C_2\), with \(C=C_1\cup C_2\); here t is obtained from \(u'\) by rewriting \(e_j\) to u. This case is handled similarly to the case where \(e_i=t(\underline{y})\, \Vert \, C\) is obtained via the Demodulation rule. \(\square \)
Having established termination, we now prove that our calculus computes covers. To this aim, we rely on the refutational completeness of the unconstrained Superposition Calculus: thus, our technique resembles the technique used in [5, 6] to prove refutational completeness of hierarchic superposition, although it is not clear whether Theorem 5.6 below can be derived from the results concerning hierarchic superposition^{Footnote 8}.
We state the following theorem:
Theorem 5.6
Let T be the theory \(\mathcal {EUF}\). Suppose that the above algorithm, taking as input the primitive \(\underline{e}\)-flat formula \(\exists \underline{e}\,\phi (\underline{e}, \underline{y})\), gives as output the quantifier-free formula \(\psi (\underline{y})\). Then the latter is a T-cover of \(\exists \underline{e}\,\phi (\underline{e}, \underline{y})\). \(\lhd \)
Proof
Let S be the saturated set of constrained literals produced upon termination of the algorithm; let \(S=S_1\cup S_2\), where \(S_1\) contains the constrained literals in which the \(\underline{e}\) do not occur and \(S_2\) is its complement. Clearly \(\exists \underline{e}\,\phi (\underline{e}, \underline{y})\) turns out to be logically equivalent to
$$ \bigwedge _{L\,\Vert \, C\in S_1} (\bigwedge C\rightarrow L)\ \wedge \ \exists \underline{e}\, \bigwedge _{L\,\Vert \, C\in S_2} (\bigwedge C\rightarrow L) $$
so, as a consequence, in view of Lemma 3.1 it is sufficient to show that every model \(\mathcal M\) satisfying \(\bigwedge _{L\,\Vert \, C\in S_1} (\bigwedge C\rightarrow L)\) via an assignment \(\mathcal I\) to the variables \(\underline{y}\) can be embedded into a model \(\mathcal M'\) such that for a suitable extension \(\mathcal I'\) of \(\mathcal I\) to the variables \(\underline{e}\) we have that \((\mathcal M', \mathcal I')\) satisfies also \(\bigwedge _{L\,\Vert \, C\in S_2} (\bigwedge C\rightarrow L)\).
Fix \(\mathcal M, \mathcal I\) as above. The diagram \(\varDelta (\mathcal M)\) of \(\mathcal M\) is obtained as follows. We take one free constant for each element of the support of \(\mathcal M\) (by the Löwenheim-Skolem theorem one can assume \(\mathcal M\) to be at most countable) and we put in \(\varDelta (\mathcal M)\) all the literals of the kind \(f(c_1, \dots , c_k)=c_{k+1}\) and \(c_1\ne c_2\) which are true in \(\mathcal M\) (here the \(c_i\) are names for the elements of the support of \(\mathcal M\)). Let R be the set of ground equalities of the form \(y_i=c_i\), where \(c_i\) is the name of \(\mathcal I(y_i)\). Extend our reduction ordering in the natural way (so that \(y_i=c_i\) and \(f(c_1, \dots , c_k)=c_{k+1}\) are oriented from left to right). Consider now the set of clauses
$$ \varDelta (\mathcal M)\,\cup \, R\,\cup \,\{\textstyle \bigwedge C\rightarrow L\ \mid \ L\,\Vert \, C\in S\} \qquad (5) $$
(below, we distinguish the positive and the negative literals of \(\varDelta (\mathcal M)\) so that \(\varDelta (\mathcal M)=\varDelta ^+(\mathcal M)\cup \varDelta ^-(\mathcal M)\)). We want to saturate the above set in the standard Superposition Calculus. Clearly the rewriting rules in R, used as reduction rules, replace everywhere \(y_i\) by \(c_i\) inside the clauses of the kind \(\bigwedge C\rightarrow L\). At this point, the negative literals from the equality constraints all disappear: if they are true in \(\mathcal M\), they \(\varDelta ^+(\mathcal M)\)-normalize to trivial equalities \(c_i = c_i\) (to be eliminated by standard reduction rules) and if they are false in \(\mathcal M\) they become part of clauses subsumed by true inequalities from \(\varDelta ^-(\mathcal M)\). Similarly all the \(\underline{e}\)-free literals not coming from \(\varDelta (\mathcal M)\cup R\) get removed. Let \({\tilde{S}}\) be the set of surviving literals involving the \(\underline{e}\) (they are not constrained anymore and they are \(\varDelta ^+(\mathcal M)\cup R\)-normalized): we show that they cannot produce new clauses. Indeed, let \((\pi )\) be an inference from the Superposition Calculus [54] applying to them. Since no superposition with \(\varDelta (\mathcal M)\cup R\) is possible, this inference must involve only literals from \({\tilde{S}}\); suppose it produces a literal \({\tilde{L}}\) from the literals \({\tilde{L}}_1, {\tilde{L}}_2\) (coming via \(\varDelta ^+(\mathcal M)\cup R\)-normalization from \(L_1\,\Vert \, C_1\in S\) and \(L_2\,\Vert \, C_2\in S\)) as parent clauses. Then, by Lemma 5.1, our constrained inferences produce a constrained literal \(L\,\Vert \, C\) such that the clause \(\bigwedge C\rightarrow L\) normalizes to \({\tilde{L}}\) via \(\varDelta ^+(\mathcal M)\cup R\). Since S is saturated, the constrained literal \(L\,\Vert \, C\), after simplification, belongs to S. 
Now simplifications via our Constrained Demodulation and \(\varDelta ^+(\mathcal M)\cup R\)-normalization commute (they work at parallel positions, see Remark 5.4), so the inference \((\pi )\) is redundant because \({\tilde{L}}\) simplifies to a literal already in \({\tilde{S}}\cup \varDelta (\mathcal M)\).
Thus the set of clauses (5) saturates without producing the empty clause. By the completeness theorem of the Superposition Calculus [3, 39, 54] it has a model \(\mathcal M'\). By the Robinson Diagram Lemma, this \(\mathcal M'\) by construction fits our requirements. \(\square \)
Theorem 5.6, thanks to the relationship between model completions and covers stated in Theorem 3.2, also proves the existence of the model completion of \(\mathcal {EUF}\).
Example 5.5
We compute the cover of the primitive formula \(\exists e\, f(f(e,y_1),y_2)\ne f(f(e,y_1'),y_2')\). Flattening gives the set of literals
$$ f(e,y_1)= e_1,\quad f(e,y'_1)= e_2,\quad f(e_1,y_2)= e'_1,\quad f(e_2,y'_2)= e'_2,\quad e'_1\ne e'_2. $$
Superposition Right produces the constrained literal \( e_1= e_2 \,\Vert \, \{y_1=y'_1\}\); supposing that we have \(e_1> e_2\), Superposition Right gives first \(f(e_2,y_2)=e'_1 \,\Vert \, \{y_1=y'_1\}\) and then also \( e'_1=e'_2\,\Vert \, \{y_1=y'_1, y_2=y'_2\}\). Superposition Left and Reflection now produce \( \bot \,\Vert \, \{y_1=y'_1, y_2=y'_2\}\). Thus the clause \(y_1=y'_1 \wedge y_2=y'_2 \rightarrow \bot \) will be part of the output (actually, this will be the only clause in the output). \(\lhd \)
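As a sanity check, the cover computed in this example can be validated semantically by brute force. The following sketch (our own illustration; a finite-domain check, not a proof) enumerates all interpretations of f over a two-element domain and verifies that the body of the formula is satisfiable exactly on those assignments where the computed cover \(\lnot (y_1=y'_1 \wedge y_2=y'_2)\) holds:

```python
# Brute-force sanity check of Example 5.5 (our own sketch, not the calculus):
# over a two-element domain, exists e. f(f(e,y1),y2) != f(f(e,y1'),y2')
# is satisfiable exactly where the computed cover not(y1=y1' and y2=y2') holds.
from itertools import product

DOM = (0, 1)

def cover_holds(a1, a2, b1, b2):
    # the single output clause: y1=y1' /\ y2=y2' -> bottom
    return not (a1 == b1 and a2 == b2)

def body_satisfiable(a1, a2, b1, b2):
    # enumerate all interpretations f: DOM x DOM -> DOM and all witnesses e
    for fvals in product(DOM, repeat=4):
        f = {(0, 0): fvals[0], (0, 1): fvals[1],
             (1, 0): fvals[2], (1, 1): fvals[3]}
        if any(f[f[e, a1], a2] != f[f[e, b1], b2] for e in DOM):
            return True
    return False

for args in product(DOM, repeat=4):
    assert body_satisfiable(*args) == cover_holds(*args)
```

When the two argument pairs coincide, the compared terms are syntactically identical and no f can separate them; otherwise a projection or an XOR-like interpretation of f already witnesses satisfiability, matching the cover.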
We apply our algorithm to an additional example, taken from [37].
Example 5.6
We compute the cover of the primitive formula \(\exists e\, (s_1=f(y_3,e)\wedge s_2=f(y_4,e)\wedge t=f(f(y_1,e), f(y_2,e)))\), where \(s_1, s_2, t\) are terms in \(\underline{y}\). Flattening gives the set of literals
$$ f(y_3,e)= s_1,\quad f(y_4,e)= s_2,\quad f(y_1,e)= e_1,\quad f(y_2,e)= e_2,\quad f(e_1,e_2)= t. $$
Suppose that we have \(e>e_1> e_2>t>s_1>s_2>y_1>y_2>y_3>y_4\). Superposition Right between the 3rd and the 4th clauses produces the constrained 6th clause \(e_1=e_2 \,\Vert \, \{y_1=y_2\}\). From now on, we denote the application of a Superposition Right to the ith and jth clauses with R(i, j). We list a derivation performed by our calculus:
The set of clauses above is saturated. The 7th, 17th, 18th, 19th and 20th clauses are exactly the output clauses of [37]. The non-simplified clauses that do not appear as output in [37] are redundant, and they could be simplified by introducing a Subsumption rule as an additional simplification rule of our calculus. \(\lhd \)
6 Complexity Analysis of the Fragment for Database-Driven Applications
The saturation procedure of Theorem 5.6 can in principle produce doubly exponentially many clauses, because there are exponentially many terms of degree n (if n is the cardinality of the variables to be eliminated); it is not clear whether we can improve this bound to a simply exponential one by limiting the kind of terms that can be produced. An estimation of the complexity of computing uniform interpolants in \(\mathcal {EUF}\) is better performed within approaches making use of compressed DAG-representations of terms [26]. In this paper, however, we are especially interested (for our applications to verification of data-aware processes) in the special case where the signature \(\varSigma \) contains only unary function symbols and relations of arbitrary arity (cf. Sect. 4.1). In this special case, important remarks apply. In fact, we shall see below that if the signature \(\varSigma \) contains only unary function symbols, only empty constraints can be generated; in case \(\varSigma \) contains also relation symbols of arity \(n>1\), the only constrained clauses that can be generated have the form \(\bot \,\Vert \,\{t_1=t'_1,\dots , t_{n-1}=t'_{n-1}\}\). Also, it is not difficult to see that in a derivation at most one explicit definition \(e_i=t(\underline{y})\,\Vert \,\emptyset \) can occur for every \(e_i\): as soon as this definition is produced, all occurrences of \(e_i\) are rewritten to t. This implies that Constrained Superposition computes covers in polynomial time for the empty theory, whenever the signature \(\varSigma \) matches the restrictions of Definition 4.3 for DB schemas. We give here a finer complexity analysis, in order to obtain a quadratic bound.
In this section, we assume that our signature \(\varSigma \) contains only unary function symbols and m-ary relation symbols. In order to attain the optimized quadratic complexity bound, we need to follow a different strategy in applying the rules of our constrained superposition calculus (this different strategy would not be correct for the general case). Thanks to this different strategy, we can make our procedure close to the algorithm of [37]: in fact, that algorithm is correct for the case of unary functions and requires only a minor adjustment for the case of unary functions and m-ary relations. Since relations play a special role in the present restricted context, we prefer to treat them as such, i.e. not to rewrite \(R(t_1,\dots , t_n)\) as \(R(t_1,\dots , t_n)=true\); the consequence is that we need an additional Constrained Resolution Rule^{Footnote 9}. We preliminarily notice that when function symbols are all unary, all constraints remain empty during the run of the saturation procedure, except for the case of the newly introduced Resolution Rule below. This fact follows from the observation that given two terms \(u_1\) and \(u_2\), the procedure \(E(u_1,u_2)\) does not fail iff:

(1)
either \(u_1\) and \(u_2\) are both terms containing only variables from \(\underline{y}\), or

(2)
\(u_1\) and \(u_2\) are terms that syntactically coincide.
In case (1), \(E(u_1,u_2)\) is \(\{u_1=u_2\}\) and in case (2), \(E(u_1,u_2)\) is \(\emptyset \). In case (1), Superposition Rules are not applicable. To show this, suppose that \(u_1 \equiv s_{\vert p}\) and \(u_2\equiv l\); then, the terms l and r use only variables from \(\underline{y}\), and consequently cannot be fed into Superposition Rules, since Superposition Rules are only applied when variables from \(\underline{e}\) occur in both premises. The Reflection Rule does not apply in case (1) either, because this rule (like any other rule) cannot be applied to an \(\underline{e}\)-free literal.
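The dichotomy above can be rendered as a small sketch of the procedure E restricted to this setting. The term representation (nested tuples, variables tagged 'e' or 'y') and all names are our own, not the paper's:

```python
# Sketch of E(u1, u2) in the unary-function setting (our own rendering).
# Terms: a variable is ('e', name) or ('y', name); a unary application is
# ('app', f, t).

FAIL = object()  # E may fail: no constraint can make u1 and u2 equal

def is_e_free(t):
    """True iff no existential variable from e occurs in t."""
    if t[0] == 'e':
        return False
    if t[0] == 'y':
        return True
    return is_e_free(t[2])          # t == ('app', f, arg)

def E(u1, u2):
    """Only two non-failing outcomes exist in the unary setting."""
    if u1 == u2:
        return set()                # case (2): syntactic coincidence, empty constraint
    if is_e_free(u1) and is_e_free(u2):
        return {(u1, u2)}           # case (1): the single y-equality u1 = u2
    return FAIL

e, y1, y2 = ('e', 'e'), ('y', 'y1'), ('y', 'y2')
f = lambda t: ('app', 'f', t)

assert E(f(y1), f(y2)) == {(f(y1), f(y2))}  # case (1)
assert E(f(e), f(e)) == set()               # case (2)
assert E(f(e), f(y1)) is FAIL               # otherwise: failure
```

In the general (non-unary) signature, E recurses on the term structure; here the two cases above exhaust the non-failing outcomes.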
Thus, in the particular case of mary relations and unary functions, the rules of the calculus are the following:
We still restrict the use of our rules to the case where no premise is an \(\underline{e}\)-free literal; again, Demodulation is applied only in the case where \(l=r\) is of the kind \(e_i=t(\underline{y})\). Concerning the order of application of the rules, Lemma 6.1 below shows that we can apply (restricted) Superpositions, Demodulations, Reflections and Resolutions in this order and then stop.
An important preliminary observation to obtain such a result is that we do not need to apply Superposition Rules whose left premise \(l=r\) is of the kind \(e_i=t(\underline{y})\): this is because constraints are always empty (unless the constrained clause is the empty clause), so that a Superposition Rule with the left premise \(e_i=t(\underline{y})\) can be replaced by a Demodulation Rule.^{Footnote 10} If the left premise of Superposition is not of the kind \(e_i=t(\underline{y})\), then since our literals are \(\underline{e}\)-flat, it can be either of the kind \(e_i=e_j\) (with \(e_i> e_j\)) or of the kind \(f(e_i)= t\). In the latter case t is either \(e_k\in \underline{e}\) or it is an \(\underline{e}\)-free term; for Superposition Left (i.e. for Superposition applied to a negative literal), the left premise can only be \(e_i=e_j\), because our literals are \(\underline{e}\)-flat and so negative literals L cannot have a position p such that \(L_{\vert p}\equiv f(e_i)\).
Let S be a set of \(\underline{e}\)-flat literals with empty constraints; we say that S is RS-closed iff it is closed under Restricted Superposition Rules, i.e. under Superposition Rules whose left premise is not of the kind \(e_i=t(\underline{y})\). In equivalent terms, as a consequence of the above discussion, S is RS-closed iff it satisfies the following two conditions:

if \(\{f(e_i)=t, f(e_i)=v\}\subseteq S\), then \(t=v\in S\);

if \(\{e_i=e_j, L\} \subseteq S\) and \(e_i>e_j\) and \(L_{\vert p}\equiv e_i\), then \(L[e_j]_p\in S\).
Since Restricted Superpositions do not introduce essentially new terms (newly introduced terms are just rewritings of variables with variables), it is clear that we can make a finite set S of \(\underline{e}\)-flat literals RS-closed in finitely many steps. This can be naively done in time quadratic in the size of the formula. As an alternative, we can apply a congruence closure algorithm to S and produce a set of literals \(S'\) which is RS-closed and logically equivalent to S: the latter can be done in \(O(n\cdot \log (n))\) time, as is well known from the literature [40, 48, 52].
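The naive quadratic closure can be sketched as a fixpoint loop over pairs of equalities. This is our own simplified rendering: unary applications are 1-argument tuples, the variable order \(e_1>e_2>\dots \) is approximated lexicographically, and all names are assumptions:

```python
# Naive RS-closure sketch (our own illustration of the two closure conditions).
# A term is a variable name such as 'e1' or 'y1', or a tuple ('f', t).
# An equality is an oriented pair (l, r) with the bigger side on the left.

def subst(t, old, new):
    """Replace every occurrence of variable old by new."""
    if t == old:
        return new
    if isinstance(t, tuple):
        return (t[0], subst(t[1], old, new))
    return t

def rs_close(eqs):
    """Close under: (a) f(e_i)=t, f(e_i)=v ==> t=v;
    (b) e_i=e_j (e_i > e_j) rewrites e_i to e_j in every other literal."""
    S = set(eqs)
    changed = True
    while changed:                              # naive fixpoint loop
        changed = False
        for (l1, r1) in list(S):
            for (l2, r2) in list(S):
                if (l1, r1) == (l2, r2):
                    continue
                if l1 == l2 and isinstance(l1, tuple):
                    # condition (a); lexicographic stand-in for the ordering
                    new = (r1, r2) if str(r1) <= str(r2) else (r2, r1)
                elif isinstance(l1, str) and l1.startswith('e') \
                        and isinstance(r1, str) and r1.startswith('e'):
                    new = (subst(l2, l1, r1), subst(r2, l1, r1))  # condition (b)
                else:
                    continue
                if new[0] != new[1] and new not in S:
                    S.add(new)
                    changed = True
    return S

S = rs_close({(('f', 'e1'), 'e2'), (('f', 'e1'), 'e3'), (('f', 'e2'), 'y1')})
assert ('e2', 'e3') in S            # condition (a): two results for f(e1)
assert (('f', 'e3'), 'y1') in S     # condition (b): e2 rewritten to e3
```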
Lemma 6.1
Let S be an RS-closed set of empty-constrained \(\underline{e}\)-flat literals. Then, to saturate S it is sufficient to first exhaustively apply the Demodulation Rule, and then the Reflection and Resolution Rules. \(\lhd \)
Proof
Let \({\tilde{S}}\) be the set obtained from S after having exhaustively applied Demodulation. Notice that the final effect of the reiterated application of Demodulation can be synthetically described by saying that literals in S are rewritten by using some explicit definitions
$$ e_{i_1}=t_1,\quad e_{i_2}=t_2,\quad \dots ,\quad e_{i_k}=t_k \qquad (6) $$
These definitions are either in S, or are generated through the Demodulations themselves (we can freely assume that Demodulations are done in appropriate order: first all occurrences of \(e_{i_1}\) are rewritten to \(t_1\), then all occurrences of \(e_{i_2}\) are rewritten to \(t_2\), etc.).^{Footnote 11}
Suppose now that a pair \(L, l=r\in {\tilde{S}}\) can generate a new literal \(L[r]_p\) by Superposition. We know from above that we can limit ourselves to Restricted Superposition, so l is either of the form \(e_j\) or of the form \(f(e_j)\), where moreover \(e_j\) is not among the set \(\{e_{i_1}, \dots , e_{i_k}\}\) from (6). The literals L and \(l=r\in {\tilde{S}}\) were obtained from literals \(L'\) and \(l=r'\) belonging to S by applying the rewriting rules (6) (notice that l cannot have been rewritten). Since such rewritings must have occurred in positions parallel to p and since S was closed under Restricted Superposition, we must have that S contained the literal \(L'[r']_p\) that rewrites to \(L[r]_p\) by the rewriting rules (6). This shows that \(L[r]_p\) is already in \({\tilde{S}}\)
(thus, in particular, Demodulation does not destroy RSclosedness) and proves the lemma, because Reflection and Resolution can only produce the empty clause and no rule applies to the empty clause. \(\square \)
Thus the strategy of applying, in this order, Restricted Superposition, Demodulation, Reflection and Resolution always saturates.
To produce an output in optimized format, it is convenient to get it in a DAG-like form. This can be simulated via explicit acyclic definitions as follows. When we write \( Def (\underline{e}, \underline{y})\) (where \(\underline{e}, \underline{y}\) are tuples of distinct variables), we mean any flat formula of the kind (let \(\underline{e}:=e_1, \dots , e_n\)) \( \bigwedge _{i=1}^n e_i =t_i \), where in the term \(t_i\) only the variables \(e_1, \dots , e_{i-1}, \underline{y}\) can occur. We shall supply the output in the form
$$ \exists \underline{e}'\,( Def (\underline{e}', \underline{y})\wedge \psi (\underline{e}', \underline{y})) \qquad (7) $$
where \(\underline{e}'\) is a subset of \(\underline{e}\) and \(\psi \) is quantifier-free. The DAG-format (7) is not quantifier-free, but it can be converted to a quantifier-free formula by unravelling the acyclic definitions of the \(\underline{e}'\).
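Unravelling the acyclic definitions can be sketched as a left-to-right substitution pass (our own rendering, with our own term representation; note that unravelling may blow up the formula size exponentially when definitions are shared, which is exactly why the DAG-format is convenient as output):

```python
# Unravelling acyclic definitions (sketch; representation and names are ours).
# A term is a variable name or a tuple ('f', t1, ..., tk); psi is a list of
# literals (op, l, r).

def subst_term(t, env):
    if isinstance(t, tuple):
        return (t[0],) + tuple(subst_term(a, env) for a in t[1:])
    return env.get(t, t)

def unravel(defs, psi):
    """defs: ordered pairs (e_i, t_i), where t_i uses only y's and earlier e's."""
    env = {}
    for e, t in defs:
        env[e] = subst_term(t, env)   # fully expand t_i before binding e_i
    return [(op, subst_term(l, env), subst_term(r, env)) for (op, l, r) in psi]

defs = [('e1', ('f', 'y1')), ('e2', ('g', 'e1', 'y2'))]
out = unravel(defs, [('=', 'e2', 'y3')])
assert out == [('=', ('g', ('f', 'y1'), 'y2'), 'y3')]
```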
Thus our procedure for computing a cover in DAG-format of a primitive formula \(\exists \underline{e}\,\phi (\underline{e}, \underline{y})\) (in case the function symbols of the signature \(\varSigma \) are all unary) runs by performing the following steps, one after the other. Let OUT be a quantifier-free formula (initially OUT is \(\top \)).

(1)
We preprocess \(\phi \) in order to produce an RS-closed set S of empty-constrained \(\underline{e}\)-flat literals.

(2)
We mark the variables \(\underline{e}\) in the following way (initially, all variables are unmarked): we scan S and, as soon as we find an equality of the kind \(e_i=t\) where all variables from \(\underline{e}\) occurring in t are marked, we mark \(e_i\). This loop is repeated until no more variables get marked.

(3)
If Reflection is applicable, we output \(\bot \) and exit.

(4)
We conjoin OUT with all literals where, besides the \(\underline{y}\), only marked variables occur.

(5)
For every literal \(R(t_1,\dots ,e,\dots , t_m)\) that contains at least one unmarked e, we scan S until a literal of the kind \(\lnot R(u_1,\dots , u_m)\) is found: then, we try to apply Resolution and, if we succeed in getting \(\bot \,\Vert \, \{ u_1=u'_1, \dots , u_m=u'_m\}\), we conjoin \(\bigvee _j u_j\ne u'_j\) to OUT.

(6)
We prefix to OUT a string of existential quantifiers binding all marked variables and output the result.
One remark is in order: when running the subprocedures \(E(s_i,t_i)\) required by the Resolution Rule in (5) above, all marked variables must be considered as part of the \(\underline{y}\) (thus, e.g. \(R(e, t), \lnot R(e,u)\) produces \(\bot \,\Vert \,\{t=u\}\) if both t and u contain, besides the \(\underline{y}\), only marked variables).
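The steps above (except the preprocessing step (1), which is assumed already performed, and the Resolution step (5) on relational literals, which is omitted) can be condensed into the following sketch; the representation of terms and literals, and all names, are our own simplifications:

```python
# Condensed sketch of steps (2), (3), (4), (6) of the cover procedure (ours).
# Terms as nested tuples / variable names; e-variables are names starting
# with 'e'. Literals are triples (op, l, r) with op in {'=', '!='}.

def evars(t):
    if isinstance(t, tuple):
        return set().union(*(evars(a) for a in t[1:]))
    return {t} if isinstance(t, str) and t.startswith('e') else set()

def cover_dag(literals):
    # step (2): mark e_i once it has a definition e_i = t whose e-variables
    # are all already marked (naive quadratic fixpoint)
    marked = set()
    changed = True
    while changed:
        changed = False
        for (op, l, r) in literals:
            if op == '=' and isinstance(l, str) and l.startswith('e') \
                    and l not in marked and evars(r) <= marked:
                marked.add(l)
                changed = True
    # step (3), simplified: a trivially contradictory disequality yields bottom
    if any(op == '!=' and l == r for (op, l, r) in literals):
        return ['bottom']
    # step (4): keep every literal whose e-variables are all marked
    out = [lit for lit in literals
           if evars(lit[1]) | evars(lit[2]) <= marked]
    # step (6): existentially quantify the marked variables (DAG-format)
    return ['exists ' + ' '.join(sorted(marked))] + out

lits = [('=', 'e1', ('f', 'y1')),   # e1 defined over the y's
        ('=', 'e2', ('g', 'e1')),   # e2 defined from the marked e1
        ('=', 'e3', ('h', 'e4')),   # e4 has no definition: e3 stays unmarked
        ('!=', 'e1', 'y2')]
res = cover_dag(lits)
assert res[0] == 'exists e1 e2'
assert ('=', 'e3', ('h', 'e4')) not in res
assert ('!=', 'e1', 'y2') in res
```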
Proposition 6.2
Let T be the theory \(\mathcal {EUF}\) in a signature with unary function and m-ary relation symbols. Consider a primitive formula \(\exists \underline{e}\,\phi (\underline{e}, \underline{y})\); then, the above algorithm returns a T-cover of \(\exists \underline{e}\,\phi (\underline{e}, \underline{y})\) in DAG-format in time \(O(n^2)\), where n is the size of \(\exists \underline{e}\,\phi (\underline{e}, \underline{y})\). \(\lhd \)
Proof
The preprocessing step (1) requires an abstraction phase for producing \(\underline{e}\)-flat literals and a second phase in order to get an RS-closed set: the first phase requires linear time, whereas the second one requires \(O(n\cdot \log(n))\) time (via congruence closure). All the remaining steps require linear time, except steps (2) and (5), which require quadratic time. This is the dominating cost; thus the entire procedure requires \(O(n^2)\) time. \(\square \)
Although we do not deeply investigate the problem here, we conjecture that it might be possible to further lower the above bound to \(O(n\cdot \log(n))\).
7 An Extension of the Constrained Superposition Calculus
We consider an extension of our Constrained Superposition Calculus which is useful for our applications to the verification of data-aware processes. Let us assume that we have a theory whose axioms are (3), namely, for every function symbol f:
$$ \forall x\;(x = \texttt{undef}\leftrightarrow f(x) = \texttt{undef}) \qquad (3) $$
One direction of the above equivalence is equivalent to the ground literal \(f(\texttt {undef})=\texttt {undef}\) and as such it does not interfere with the completion process (we just add it to our constraints from the very beginning).
To handle the other direction, we need to modify our Calculus. First, we add to the Constrained Superposition Calculus of Sect. 5 the following extra rule \(Ext(\texttt{undef})\):
$$ \frac{f(e_j)=u(\underline{y})\ \Vert\ D}{e_j=\texttt{undef}\ \Vert\ D\cup \{u(\underline{y})=\texttt{undef}\}} $$
The Rule is sound because \( u(\underline{y})=\texttt {undef}\wedge f(e_j)=u(\underline{y}) \rightarrow e_j= \texttt {undef}\) follows from the axioms (3). For cover computation with our new axioms, we also need a restricted version of the Paramodulation Rule:
Notice that we can have \(e_j>r\) only in case r is either some existential variable \(e_i\) or an \(\underline{e}\)-free term \(u(\underline{y})\). The Paramodulation Rule (if it is not a Superposition) can only apply to the right member of an equality, and such a right member must be \(e_j\) itself (because our literals are flat). Thus the rule cannot introduce new terms and consequently does not compromise the termination argument of Proposition 5.5.
Theorem 7.1
Let T be the theory \(\bigcup _{f\in \varSigma }\{ \forall x~(x = \texttt {undef}\leftrightarrow f(x) = \texttt {undef})\}\). Suppose that the algorithm from Sect. 5, taking as input the primitive \(\underline{e}\)-flat formula \(\exists \underline{e}\,\phi (\underline{e}, \underline{y})\), gives as output the quantifier-free formula \(\psi (\underline{y})\). Then the latter is a T-cover of \(\exists \underline{e}\,\phi (\underline{e}, \underline{y})\). \(\lhd \)
Proof
The proof of Theorem 5.6 can be easily adjusted as follows. We proceed as in the proof of Theorem 5.6, so as to obtain the set \(\varDelta (\mathcal M)\cup R \cup {\tilde{S}}\) which is saturated in the standard (unconstrained) Superposition Calculus. Below, we refer to the general refutational completeness proof of the Superposition Calculus given in [54]. Since we only have unit literals here, in order to produce a model of \(\varDelta (\mathcal M)\cup R \cup {\tilde{S}}\), we can just consider the convergent ground rewriting system \(\rightarrow \) consisting of the oriented equalities in \(\varDelta ^+(\mathcal M)\cup R \cup {\tilde{S}}\): the support of such a model is formed by the \(\rightarrow \)-normal forms of our ground terms, with the obvious interpretation for the function and constant symbols. For simplicity, we assume that \(\texttt {undef}\) is in normal form.^{Footnote 12} We need to check that whenever we have^{Footnote 13} \(f(t)\rightarrow ^* \texttt {undef}\) then we also have \(t\rightarrow ^* \texttt {undef}\): we prove this by induction on the reduction ordering for our ground terms. Let t be a term such that \(f(t)\rightarrow ^* \texttt {undef}\): if t is \(\underline{e}\)-free then the claim is trivial (because the axioms (3) are supposed to hold in \(\mathcal M\)). Suppose also that the induction hypothesis applies to all terms smaller than t. If t is not in normal form, then let \({\tilde{t}}\) be its normal form; then we have \(f(t)\rightarrow ^+ f({\tilde{t}})\rightarrow ^* \texttt {undef}\), by the fact that \(\rightarrow \) is convergent. By the induction hypothesis, \({\tilde{t}}\rightarrow ^* \texttt {undef}\), hence \(t\rightarrow ^+ {\tilde{t}}\rightarrow ^* \texttt {undef}\), as desired.
Finally, let us consider the case in which t is in normal form; since f(t) is reducible in root position by some rule \(l\rightarrow r\), our rules \(l\rightarrow r\) are \(\underline{e}\)-flat and t is not \(\underline{e}\)-free, we have that \(t\equiv e_j\) for some existential variable \(e_j\). Then, we must have that S contains an equality of the kind \(f(e_j)=u(\underline{y})\, \Vert \, D\) or of the kind \(f(e_j)=e_i\, \Vert \, D\) (the constraint D being true in \(\mathcal M\) under the given assignment to the \(\underline{y}\)). The latter case is reduced to the former, since \(e_i\rightarrow ^* \texttt {undef}\) (by the convergence of \(\rightarrow \)) and since S is closed under Paramodulation. In the former case, by the rule \(Ext(\texttt {undef})\), we must have that S contains \(e_j=\texttt {undef}\, \Vert \, D\cup \{u(\underline{y})= \texttt {undef}\}\). Now, since \(f(e_j)=u(\underline{y})\, \Vert \, D\) belongs to S and D is true in \(\mathcal M\), we have that the normal forms of \(f(e_j)\) and of \(u(\underline{y})\) are the same; since the normal form of \(f(e_j)\) is \(\texttt {undef}\), the normal form of \(u(\underline{y})\) is \(\texttt {undef}\) too, which means that \(u(\underline{y})=\texttt {undef}\) is true in \(\mathcal M\). But \(e_j=\texttt {undef}\, \Vert \, D\cup \{u(\underline{y})= \texttt {undef}\}\) belongs to S, hence \(e_j=\texttt {undef}\) belongs to \({\tilde{S}}\), which implies \(e_j\rightarrow ^* \texttt {undef}\), as desired. \(\square \)
8 Remarks on MCMT Implementation
As evident from Sect. 4.1, our main motivation for investigating covers originated from the verification of data-aware processes. Such applications require database (DB) signatures to contain only unary function symbols (besides relations of every arity). We observed that computing covers of primitive formulae in such signatures requires only polynomial time. In addition, if relation symbols are at most binary, the cover of a primitive formula is a conjunction of literals (this is due to the fact that the constrained literals produced during saturation either have empty constraints or are of the kind \(\bot \,\Vert \, \{t_1=t_2\}\)): this is crucial in applications, because model checkers like mcmt [32] and cubicle [19] represent sets of reachable states as primitive formulae. This makes cover computation quite an attractive technique in the verification of data-aware processes.
Our cover algorithm for DB signatures has been implemented in the model checker mcmt. The implementation is however still partial; nevertheless, the tool is able to compute covers for the \(\mathcal {EUF}\)-fragment with unary function symbols, unary relations and binary relations. The optimized procedure of Sect. 6 has not yet been implemented; instead, mcmt uses a customary Knuth-Bendix completion (in fact, for the above mentioned fragments the constraints are always trivial and our constrained Superposition Calculus essentially boils down to Knuth-Bendix completion for ground literals in \(\mathcal {EUF}\)).
Axioms (3) are also taken into account, in the following way. We assume that the constraints of which we want to compute the cover always contain either the literal \(e_j=\texttt {undef}\) or the literal \(e_j\ne \texttt {undef}\) for every existential variable \(e_j\). Whenever a constraint contains the literal \(e_j\ne \texttt {undef}\), the completion procedure adds the literal \(u(y_i)\ne \texttt {undef}\) whenever it has produced a literal of the kind \(f(e_j)= u(y_i)\).^{Footnote 14}
One may wonder whether we are justified in assuming that all constraints of which we want to compute the cover always contain either the literal \(e_j=\texttt {undef}\) or the literal \(e_j\ne \texttt {undef}\) for every existential variable \(e_j\). The answer is affirmative: according to the backward search algorithm implemented in array-based system tools, the variable \(e_j\) to be eliminated always comes from the guard of a transition, and we can assume that such a guard contains the literal \(e_j\ne \texttt {undef}\) (if we need a transition with \(e_j= \texttt {undef}\) for an existentially quantified variable \(e_j\), it is possible to write this condition trivially without using a quantified variable). The mcmt User Manual (available from the distribution) contains precise instructions on how to write specifications following the above prescriptions.
A first experimental evaluation (based on the existing benchmark provided in [44], which samples 32 real-world BPMN workflows taken from the BPMN official website http://www.bpmn.org/) is described in [11, 17]. The benchmark set is available as part of the last distribution 3.0 of mcmt (http://users.mat.unimi.it/users/ghilardi/mcmt/); see the subdirectory /examples/dbdriven of the distribution. The User Manual, also included in the distribution, contains a dedicated section giving essential information on how to encode relational artifact systems (comprising both first-order and second-order variables) in mcmt specifications and how to produce user-defined examples in the database-driven framework. The first experiments were very encouraging: the tool was able to solve in a few seconds all the proposed benchmarks, and the cover computations generated automatically during the model-checking search were discharged instantaneously: see [11, 17] for more information about our experiments.
9 Conclusions and Future Work
The above experimental setup motivates new research to extend Proposition 4.5 to further theories axiomatizing integrity constraints used in DB applications.
Practical algorithms for the computation of covers in the theories falling under the hypotheses of Proposition 4.5 need to be designed: as a first small example, in Sect. 7 above we showed how to handle Axiom (3) with slight modifications to our techniques. Symbol elimination of function and predicate variables should also be combined with cover computations. Combined cover algorithms (along the perspectives in [37]) could be crucial also in this setting: a first attempt to attack this problem, regarding the disjoint-signatures combination, can be found in [16].
We consider the present work, together with [12, 13, 17, 28], as the starting point for a full line of research dedicated to SMT-based techniques for the effective verification of data-aware processes [15], addressing richer forms of verification beyond safety (such as liveness, fairness, or full LTL-FO) and richer classes of artifact systems (e.g., with concrete data types and arithmetics), while identifying novel decidable classes (e.g., by restricting the structure of the DB and of transition and state formulae) beyond the ones presented in [13, 17]. Concerning implementation, we plan to further develop our tool so as to incorporate the plethora of optimizations and sophisticated search strategies available in infinite-state SMT-based model checking. Finally, in [12] we tackle more conventional process modeling notations, concerning data-aware extensions of the de facto standard BPMN^{Footnote 15}: we plan to provide a fully automated translator from the data-aware BPMN model presented in [12] to the artifact systems setting of [13, 17].
Notes
I is possibly different from \(\omega \) (there can be uncountably many tuples \(\underline{a}_i\)).
Partial correctness means that, when the algorithm terminates, it gives a correct answer. Effectiveness means that all subprocedures in the algorithm can be effectively executed.
For our purposes, it is convenient to define a theory T to be locally finite iff for every finite tuple of variables \(\underline{x}\) there are only finitely many Tequivalence classes of atoms \(A(\underline{x})\) involving only the variables \(\underline{x}\).
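As an illustrative sketch (not part of the formal development, and for a small purely relational signature assumed here just for illustration), the finiteness required by this definition can be made concrete by exhaustively enumerating the atoms over a fixed tuple of variables:

```python
# Illustrative only: enumerate all atoms over a fixed finite tuple of
# variables for a small, hypothetical relational signature. Since the
# enumeration is finite, there are a fortiori only finitely many
# T-equivalence classes of such atoms.
from itertools import product

signature = {'R': 2, 'P': 1}   # hypothetical predicate symbols with arities
variables = ('x1', 'x2')       # the fixed finite tuple of variables

# Predicate atoms P(t1, ..., tn) with arguments among the variables.
atoms = [f"{p}({', '.join(args)})"
         for p, arity in signature.items()
         for args in product(variables, repeat=arity)]
# Equality atoms between variables.
atoms += [f"{a} = {b}" for a, b in product(variables, repeat=2)]

print(len(atoms))  # 4 + 2 + 4 = 10 atoms over x1, x2
```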
One may restrict to models interpreting sorts as finite sets, as is customary in database theory. Since the theories we are dealing with usually have the finite model property for constraint satisfiability, assuming such a restriction turns out to be irrelevant as far as safety problems are concerned (see [11, 13] for an accurate discussion).
This example points out a problem that needs to be fixed in the algorithm presented in [37]: that algorithm in fact outputs only equalities, conditional equalities and single disequalities, so it cannot correctly handle this example.
Thus we cannot apply Superposition to \(\{e=y\,\Vert \, \emptyset , f(e)=e\Vert \,\emptyset \}\) until Demodulation is exhaustively applied (the latter causes the deletion of \(f(e)=e\Vert \,\emptyset \) and its replacement with \(f(y)=y\Vert \,\emptyset \), thus blocking the above generation of infinitely many clauses).
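The effect of exhaustive Demodulation here can be rendered in a minimal sketch (illustrative only; terms are encoded as nested tuples, a representation assumed just for this example): rewriting with the unit equation \(e=y\) turns \(f(e)=e\) into \(f(y)=y\).

```python
# Illustrative only: Demodulation as exhaustive rewriting with a unit
# equation lhs = rhs (lhs assumed greater than rhs in the term order).
# Terms are constants/variables (strings) or compound terms encoded as
# tuples (function_symbol, arg1, ..., argn).

def demodulate(term, lhs, rhs):
    """Replace every occurrence of lhs in term by rhs."""
    if term == lhs:
        return rhs
    if isinstance(term, tuple):
        return (term[0],) + tuple(demodulate(a, lhs, rhs) for a in term[1:])
    return term

e, y = 'e', 'y'
equation = (('f', e), e)  # the clause f(e) = e, as a pair of terms
rewritten = tuple(demodulate(t, e, y) for t in equation)
print(rewritten)  # (('f', 'y'), 'y'), i.e. f(y) = y
```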
Notice that Superposition Left is considerably restricted in our calculus: recall that \(\underline{e}\)-flat negative literals must be of the kind \(s\ne t\), where s, t are either variables from \(\underline{e}\) or \(\underline{e}\)-free terms. Since rules do not apply to \(\underline{e}\)-free literals, the only possibility is that the term s from the literal \(s\ne t\) of the right premise of Superposition Left is a variable from \(\underline{e}\) and that the term l from the left premise coincides with it. Thus Superposition Left looks like a Demodulation; however, it is not a Demodulation, because the constraint of its left premise may not be included in the constraint of its right premise. It would be harmless to allow a more liberal version of Superposition Left, but we do not need it.
An important difference between our proof and the proof of completeness for hierarchic superposition is that we must build an expansion of a superstructure of the model \(\mathcal M\) below (expanding \(\mathcal M\) to a larger signature without enlarging its domain might not be possible in principle).
We extend the definition of an \(\underline{e}\)-flat literal so as to include also the literals of the kind \(R(t_1,\dots ,t_n)\) and \(\lnot R(t_1,\dots ,t_n)\), where the terms \(t_i\) are either \(\underline{e}\)-free terms or variables from \(\underline{e}\).
This is not true in the general case where constraints are not empty, because the Demodulation Rule does not merge incomparable constraints.
In addition, if we happen to have, say, two different explicit definitions of \(e_{i_1}\) as \(e_{i_1}=t_1, e_{i_1}=t'_1\), we decide to use just one of them (and always the same one, until the other one is eventually removed by Demodulation).
To be pedantic, according to the definition of \(\varDelta ^+(\mathcal M)\), there should be an equality \(\texttt {undef}= c_0\) in \(\varDelta ^+(\mathcal M)\) so that \(c_0\) is the normal form of \(\texttt {undef}\).
We use \(\rightarrow ^*\) for the reflexivetransitive closure of \(\rightarrow \) and \(\rightarrow ^+\) for the transitive closure of \(\rightarrow \).
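For a finite relation given as a set of pairs, both closures can be computed by naive saturation; the following sketch (illustrative only, not from the paper) does exactly that:

```python
# Illustrative only: naive saturation computing ->+ (transitive closure)
# and ->* (reflexive-transitive closure) of a finite relation.

def transitive_closure(rel):
    closure = set(rel)
    changed = True
    while changed:
        changed = False
        for (a, b) in list(closure):
            for (c, d) in list(closure):
                if b == c and (a, d) not in closure:
                    closure.add((a, d))
                    changed = True
    return closure

def reflexive_transitive_closure(rel):
    elements = {x for pair in rel for x in pair}
    return transitive_closure(rel) | {(x, x) for x in elements}

rel = {(1, 2), (2, 3)}
print((1, 3) in transitive_closure(rel))             # True
print((1, 1) in reflexive_transitive_closure(rel))   # True
```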
This is sound because \(e_j\ne \texttt {undef}\) implies \(f(e_j)\ne \texttt {undef}\) according to (3), so \(u(y_i)\ne \texttt {undef}\) follows from \(f(e_j)= u(y_i)\) and \(e_j\ne \texttt {undef}\).
References
Baader, F., Ghilardi, S., Tinelli, C.: A new combination procedure for the word problem that generalizes fusion decidability results in modal logics. Inf. Comput. pp. 1413–1452 (2006)
Baader, F., Nipkow, T.: Term Rewriting and All That. Cambridge University Press, United Kingdom (1998)
Bachmair, L., Ganzinger, H.: Rewrite-based equational theorem proving with selection and simplification. J. Log. Comput. 4(3), 217–247 (1994)
Bachmair, L., Ganzinger, H., Lynch, C., Snyder, W.: Basic paramodulation. Inf. Comput. 121(2), 172–192 (1995)
Bachmair, L., Ganzinger, H., Waldmann, U.: Refutational theorem proving for hierarchic first-order theories. Appl. Algebra Eng. Commun. Comput. 5, 193–212 (1994)
Baumgartner, P., Waldmann, U.: Hierarchic superposition with weak abstraction. In: Proc. of CADE, LNCS (LNAI), vol. 7898, pp. 39–57. Springer (2013)
Bílková, M.: Uniform interpolation and propositional quantifiers in modal logics. Studia Logica 85(1), 1–31 (2007)
Bojańczyk, M., Segoufin, L., Toruńczyk, S.: Verification of database-driven systems via amalgamation. In: Proc. of PODS, pp. 63–74 (2013)
Bruttomesso, R., Ghilardi, S., Ranise, S.: Quantifier-free interpolation in combinations of equality interpolating theories. ACM Trans. Comput. Log. 15(1), 5:1–5:34 (2014)
Calvanese, D., De Giacomo, G., Montali, M.: Foundations of data-aware process analysis: a database theory perspective. In: Proc. of PODS (2013)
Calvanese, D., Ghilardi, S., Gianola, A., Montali, M., Rivkin, A.: Verification of data-aware processes via array-based systems (extended version). Technical Report arXiv:1806.11459 (2018)
Calvanese, D., Ghilardi, S., Gianola, A., Montali, M., Rivkin, A.: Formal modeling and SMT-based parameterized verification of data-aware BPMN. In: Proc. of BPM, LNCS, vol. 11675, pp. 157–175. Springer (2019)
Calvanese, D., Ghilardi, S., Gianola, A., Montali, M., Rivkin, A.: From model completeness to verification of data-aware processes. In: Description Logic, Theory Combination, and All That, LNCS, vol. 11560, pp. 212–239. Springer (2019)
Calvanese, D., Ghilardi, S., Gianola, A., Montali, M., Rivkin, A.: Model completeness, covers and superposition. In: Proc. of CADE, LNCS (LNAI), vol. 11716, pp. 142–160. Springer (2019)
Calvanese, D., Ghilardi, S., Gianola, A., Montali, M., Rivkin, A.: Verification of data-aware processes: challenges and opportunities for automated reasoning. In: Proceedings of the 2nd International Workshop on Automated Reasoning: Challenges, Applications, Directions, Exemplary Achievements (ARCADE), vol. 311. EPTCS (2019)
Calvanese, D., Ghilardi, S., Gianola, A., Montali, M., Rivkin, A.: Combined covers and Beth definability. In: Proc. of IJCAR, LNCS (LNAI), vol. 12166, pp. 181–200. Springer (2020)
Calvanese, D., Ghilardi, S., Gianola, A., Montali, M., Rivkin, A.: SMT-based verification of data-aware processes: a model-theoretic approach. Math. Struct. Comput. Sci. 30(3), 271–313 (2020)
Chang, C.C., Keisler, H.J.: Model Theory, 3rd edn. North-Holland Publishing Co., Amsterdam–London (1990)
Conchon, S., Goel, A., Krstic, S., Mebsout, A., Zaïdi, F.: Cubicle: a parallel SMT-based model checker for parameterized systems – tool paper. In: Proc. of CAV, pp. 718–724 (2012)
Deutsch, A., Hull, R., Patrizi, F., Vianu, V.: Automatic verification of data-centric business processes. In: Proc. of ICDT, pp. 252–267 (2009)
Deutsch, A., Li, Y., Vianu, V.: Verification of hierarchical artifact systems. In: Proc. of PODS, pp. 179–194. ACM Press (2016)
Ghilardi, S.: An algebraic theory of normal forms. Ann. Pure Appl. Logic 71(3), 189–245 (1995)
Ghilardi, S.: Model theoretic methods in combined constraint satisfiability. J. Autom. Reason. 33(3–4), 221–249 (2004)
Ghilardi, S., Gianola, A.: Interpolation, amalgamation and combination (the non-disjoint signatures case). In: Proc. of FroCoS, LNCS (LNAI), vol. 10483, pp. 316–332. Springer (2017)
Ghilardi, S., Gianola, A.: Modularity results for interpolation, amalgamation and superamalgamation. Ann. Pure Appl. Logic 169(8), 731–754 (2018)
Ghilardi, S., Gianola, A., Kapur, D.: Compactly representing uniform interpolants for EUF using (conditional) DAGS. Technical Report arXiv:2002.09784 (2020)
Ghilardi, S., Gianola, A., Kapur, D.: Computing uniform interpolants for EUF via (conditional) DAGbased compact representations. In: Proc. of CILC, CEUR Workshop Proceedings, vol. 2710, pp. 67–81. CEURWS.org (2020)
Ghilardi, S., Gianola, A., Montali, M., Rivkin, A.: Petri nets with parameterised data – modelling and verification. In: Proc. of BPM, LNCS, vol. 12168, pp. 55–74. Springer (2020)
Ghilardi, S., van Gool, S.J.: Monadic second order logic as the model companion of temporal logic. In: Proc. of LICS, pp. 417–426 (2016)
Ghilardi, S., van Gool, S.J.: A modeltheoretic characterization of monadic second order logic on infinite words. J. Symb. Log. 82(1), 62–76 (2017)
Ghilardi, S., Nicolini, E., Zucchelli, D.: A comprehensive framework for combined decision procedures. ACM Trans. Comput. Log. pp. 1–54 (2008)
Ghilardi, S., Ranise, S.: MCMT: A model checker modulo theories. In: Proc. of IJCAR, LNCS (LNAI), vol. 6173, pp. 22–29. Springer (2010)
Ghilardi, S., Zawadowski, M.: Sheaves, games, and model completions, Trends in Logic—Studia Logica Library, vol. 14. Kluwer Academic Publishers, Dordrecht (2002). A categorical approach to nonclassical propositional logics
Ghilardi, S., Zawadowski, M.W.: A sheaf representation and duality for finitely presenting Heyting algebras. J. Symb. Log. 60(3), 911–939 (1995)
Ghilardi, S., Zawadowski, M.W.: Undefinability of propositional quantifiers in the modal system S4. Studia Logica 55(2), 259–271 (1995)
Ghilardi, S., Zawadowski, M.W.: Model completions, r-Heyting categories. Ann. Pure Appl. Logic 88(1), 27–46 (1997)
Gulwani, S., Musuvathi, M.: Cover algorithms and their combination. In: Proc. of ESOP, Held as Part of ETAPS, pp. 193–207 (2008)
Hoder, K., Bjørner, N.: Generalized property directed reachability. In: Proc. of SAT, pp. 157–171 (2012)
Hsiang, J., Rusinowitch, M.: Proving refutational completeness of theorem-proving strategies: the transfinite semantic tree method. J. ACM 38(3), 559–587 (1991)
Kapur, D.: Shostak’s congruence closure as completion. In: Proc. of RTA, pp. 23–37 (1997)
Kapur, D.: Nonlinear polynomials, interpolants and invariant generation for system analysis. In: Proc. of the 2nd International Workshop on Satisfiability Checking and Symbolic Computation co-located with ISSAC (2017)
Kovács, L., Voronkov, A.: Interpolation and symbol elimination. In: Proc. of CADE, LNCS (LNAI), vol. 5663, pp. 199–213. Springer (2009)
Kowalski, T., Metcalfe, G.: Uniform interpolation and coherence. Ann. Pure Appl. Logic 170(7), 825–841 (2019)
Li, Y., Deutsch, A., Vianu, V.: VERIFAS: a practical verifier for artifact systems. PVLDB 11(3), 283–296 (2017)
Lipparini, P.: Locally finite theories with model companion. In: Atti della Accademia Nazionale dei Lincei. Classe di Scienze Fisiche, Matematiche e Naturali. Rendiconti, Serie 8, vol. 72. Accademia Nazionale dei Lincei (1982)
Ludwig, M., Waldmann, U.: An extension of the Knuth–Bendix ordering with LPO-like properties. In: Proc. of LPAR, pp. 348–362 (2007)
McMillan, K.L.: Lazy abstraction with interpolants. In: Proc. of CAV, pp. 123–136 (2006)
Nelson, G., Oppen, D.C.: Fast decision procedures based on congruence closure. J. ACM 27(2), 356–364 (1980)
Nicolini, E., Ringeissen, C., Rusinowitch, M.: Data structures with arithmetic constraints: a non-disjoint combination. In: Proc. of FroCoS, LNCS (LNAI), vol. 5749, pp. 319–334. Springer (2009)
Nicolini, E., Ringeissen, C., Rusinowitch, M.: Satisfiability procedures for combination of theories sharing integer offsets. In: Proc. of TACAS, LNCS, vol. 5505, pp. 428–442. Springer (2009)
Nicolini, E., Ringeissen, C., Rusinowitch, M.: Combining satisfiability procedures for unions of theories with a shared counting operator. Fund. Inform. pp. 163–187 (2010)
Nieuwenhuis, R., Oliveras, A.: Fast congruence closure and extensions. Inf. Comput. 205(4), 557–580 (2007)
Nieuwenhuis, R., Rubio, A.: Theorem proving with ordering and equality constrained clauses. J. Symb. Comput. 19(4), 321–351 (1995)
Nieuwenhuis, R., Rubio, A.: Paramodulation-based theorem proving. In: Handbook of Automated Reasoning (in 2 volumes), pp. 371–443. MIT Press (2001)
Pitts, A.M.: On an interpretation of second order quantification in first order intuitionistic propositional logic. J. Symb. Log. 57(1), 33–52 (1992)
Robinson, A.: On the Metamathematics of Algebra. Studies in Logic and the Foundations of Mathematics. North-Holland Publishing Co., Amsterdam (1951)
Rybina, T., Voronkov, A.: A logical reconstruction of reachability. In: Perspectives of Systems Informatics, 5th International Andrei Ershov Memorial Conference, PSI 2003, Revised Papers, pp. 222–237 (2003)
Shavrukov, V.: Subalgebras of diagonalizable algebras of theories containing arithmetic. Dissertationes Mathematicae CCCXXIII (1993)
SofronieStokkermans, V.: On interpolation and symbol elimination in theory extensions. In: Proc. of IJCAR, LNCS (LNAI), vol. 9706, pp. 273–289. Springer (2016)
SofronieStokkermans, V.: On interpolation and symbol elimination in theory extensions. Log. Methods Comput. Sci. 14(3) (2018)
Vianu, V.: Automatic verification of database-driven systems: a new frontier. In: Proc. of ICDT, pp. 1–13 (2009)
van Gool, S.J., Metcalfe, G., Tsinakis, C.: Uniform interpolation and compact congruences. Ann. Pure Appl. Logic 168(10), 1927–1948 (2017)
Visser, A.: Uniform interpolation and layered bisimulation. In: Hájek, P. (ed.) Gödel ’96: Logical Foundations of Mathematics, Computer Science and Physics – Kurt Gödel’s Legacy. Springer (1996)
Wheeler, W.H.: Modelcompanions and definability in existentially complete structures. Israel J. Math. 25(3–4), 305–330 (1976)
Funding
Open access funding provided by Libera Università di Bolzano within the CRUICARE Agreement.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
Cite this article
Calvanese, D., Ghilardi, S., Gianola, A. et al. Model Completeness, Uniform Interpolants and Superposition Calculus. J Autom Reasoning 65, 941–969 (2021). https://doi.org/10.1007/s10817-021-09596-x