
1 Introduction

Generalization problems play an important role in various areas of mathematics, computer science, and artificial intelligence. Anti-unification [12, 14] is a logic-based method for computing generalizations. Originally used for inductive and analogical reasoning, it has recently found applications in recursion scheme detection in functional programs [4], programming by examples in domain-specific languages [13], learning bug-fixing from software code repositories [3, 15], automatic program repair [7], preventing bugs and misconfiguration in services [11], and linguistic structure learning for chatbots [6], to name just a few.

In most of the existing theories where anti-unification is studied, the background knowledge is assumed to be precise. Therefore, those techniques are not suitable for reasoning with incomplete, imprecise information (which is very common in real-world communication), where exact equality is replaced by its (quantitative) approximation. Fuzzy proximity and similarity relations are notable examples of such extensions. These kinds of quantitative theories have many useful applications, some of the most recent ones related to artificial intelligence, program verification, probabilistic programming, and natural language processing. Many tasks arising in these areas require reasoning methods and computational tools that deal with quantitative information. For instance, approximate inductive reasoning, reasoning and programming by analogy, and similarity detection in programming language statements or in natural language texts could benefit from solving approximate generalization constraints, which is a theoretically interesting and challenging task. Investigations in this direction have started only recently. In [1], the authors proposed an anti-unification algorithm for fuzzy similarity (reflexive, symmetric, min-transitive) relations, where mismatches are allowed not only in symbol names, but also in their arities (fully fuzzy signatures). The algorithm from [9] is designed for fuzzy proximity (i.e., reflexive and symmetric) relations with mismatches only in symbol names.

In this paper, we study approximate anti-unification from a more general perspective. The considered relations are fuzzy proximity relations. Proximal symbols may have different names and arities. We consider four different variants of relating arguments between different proximal symbols: unrestricted relations/functions, and correspondence (i.e. left- and right-total) relations/functions. A generic set of rules for computing minimal complete sets of generalizations is introduced and its termination, soundness and completeness properties are proved. From these rules, we obtain concrete algorithms that deal with different kinds of argument relations. We also show how the existing approximate anti-unification algorithms and their generalizations fit into this framework.

Organization: In Sect. 2 we introduce the notation and definitions. Section 3 is devoted to a technical notion of term set consistency and to an algorithm for computing elements of consistent sets of terms. It is used later in the main set of anti-unification rules, which are introduced and characterized in Sect. 4. The concrete algorithms obtained from those rules are also described in this section. In Sect. 5, we discuss complexity. Section 6 offers a high-level picture of the studied problems and concludes.

An extended version of this work can be found in the technical report [8].

2 Preliminaries

Proximity Relations. Given a set S, a mapping \(\mathcal {R}\) from \(S\times S\) to the real interval [0, 1] is called a binary fuzzy relation on S. By fixing a number \(\uplambda \), \(0\le \uplambda \le 1\), we can define the crisp (i.e., two-valued) counterpart of \(\mathcal {R}\), named the \(\uplambda \)-cut of \(\mathcal {R}\), as \(\mathcal {R}_\uplambda := \{(s_1,s_2) \mid \mathcal {R}(s_1,s_2) \ge \uplambda \}\). A fuzzy relation \(\mathcal {R}\) on a set S is called a proximity relation if it is reflexive (\(\mathcal {R}(s,s)=1\) for all \(s\in S\)) and symmetric (\(\mathcal {R}(s_1,s_2)=\mathcal {R}(s_2,s_1)\) for all \(s_1,s_2\in S\)). A T-norm \(\wedge \) is an associative, commutative, non-decreasing binary operation on [0, 1] with 1 as the unit element. We take the minimum as the T-norm.

Terms and Substitutions. We consider a first-order alphabet consisting of a set of fixed arity function symbols \(\mathcal {F}\) and a set of variables \(\mathcal {V}\), which includes a special symbol \(\_\) (the anonymous variable). The set of named (i.e., non-anonymous) variables \(\mathcal {V}{\setminus }\{\_\}\) is denoted by \(\mathcal {V}^\mathrm{N}\). When the set of variables is not explicitly specified, we mean \(\mathcal {V}\). The set of terms \(\mathcal {T}(\mathcal {F},\mathcal {V})\) over \(\mathcal {F}\) and \(\mathcal {V}\) is defined in the standard way: \(t\in \mathcal {T}(\mathcal {F},\mathcal {V})\) iff t is defined by the grammar \(t := x \mid f(t_1,\ldots ,t_n)\), where \(x\in \mathcal {V}\) and \(f\in \mathcal {F}\) is an n-ary symbol with \(n\ge 0\). Terms in \(\mathcal {T}(\mathcal {F},\mathcal {V}^\mathrm{N})\) are defined similarly except that all variables are taken from \(\mathcal {V}^\mathrm{N}\).

We denote arbitrary function symbols by f, g, h, constants by a, b, c, variables by x, y, z, v, and terms by s, t, r. The head of a term is defined as \(\mathsf {head}(x):=x\) and \(\mathsf {head}(f(t_1,\ldots ,t_n)):=f\). For a term t, we denote by \(\mathcal {V}(t)\) (resp. by \(\mathcal {V}^\mathrm{N}(t)\)) the set of all variables (resp. all named variables) appearing in t. A term is called linear if no named variable occurs in it more than once.

The deanonymization operation \(\mathsf {deanon}\) replaces each occurrence of the anonymous variable in a term by a fresh variable. For instance, \(\mathsf {deanon}(f(\_,x,g(\_)))= f(y',x,g(y''))\), where \(y'\) and \(y''\) are fresh. Hence, \(\mathsf {deanon}(t)\in \mathcal {T}(\mathcal {F}, \mathcal {V}^\mathrm{N})\) is unique up to variable renaming for all \(t\in \mathcal {T}(\mathcal {F},\mathcal {V})\). \(\mathsf {deanon}(t)\) is linear iff t is linear.

The notions of term depth, term size and a position in a term are defined in the standard way, see, e.g. [2]. By \(t|_p\) we denote the subterm of t at position p and by \(t[s]_p\) a term that is obtained from t by replacing the subterm at position p by the term s.

A substitution is a mapping from \(\mathcal {V}^\mathrm{N}\) to \(\mathcal {T}(\mathcal {F},\mathcal {V}^\mathrm{N})\) (i.e., without anonymous variables), which is the identity almost everywhere. We use the Greek letters \(\sigma ,\vartheta ,\varphi \) to denote substitutions, except for the identity substitution which is written as \( Id \). We represent substitutions with the usual set notation. Application of a substitution \(\sigma \) to a term t, denoted by \(t\sigma \), is defined as \(\_\sigma :=\_\), \(x\sigma :=\sigma (x)\), \(f(t_1,\ldots ,t_n)\sigma := f(t_1\sigma ,\ldots ,t_n\sigma )\). Substitution composition is defined as a composition of mappings. We write \(\sigma \vartheta \) for the composition of \(\sigma \) with \(\vartheta \).
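For illustration, the following minimal Python sketch (our own encoding, not part of the paper) represents terms as nested tuples, with strings for variables and "_" for the anonymous variable, and applies a substitution exactly as defined above.

```python
# A minimal sketch (our own encoding): terms are nested tuples (symbol, arg1, ..., argn),
# variables are strings, "_" is the anonymous variable.  A substitution is a dict over
# named variables; application follows the clauses given in the text.

def apply_subst(t, sigma):
    """Apply substitution sigma to term t."""
    if isinstance(t, str):                    # a variable
        if t == "_":                          # _ sigma := _
            return "_"
        return sigma.get(t, t)                # x sigma := sigma(x), identity elsewhere
    head, *args = t                           # f(t1,...,tn) sigma := f(t1 sigma,...,tn sigma)
    return (head, *(apply_subst(a, sigma) for a in args))

# f(x, _, g(y)) {x -> a} = f(a, _, g(y))
print(apply_subst(("f", "x", "_", ("g", "y")), {"x": ("a",)}))
```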

Argument Relations and Mappings. Given two sets \(N=\{1,\ldots ,n\}\) and \(M=\{1,\ldots ,m\}\), a binary argument relation over \(N\times M\) is a (possibly empty) subset of \(N\times M\). We denote argument relations by \(\uprho \). An argument relation \(\uprho \subseteq N \times M\) is (i) left-total if for all \(i\in N\) there exists \(j\in M\) such that \((i,j)\in \uprho \); (ii) right-total if for all \(j\in M\) there exists \(i\in N\) such that \((i,j)\in \uprho \). Correspondence relations are those that are both left- and right-total.

An argument mapping is an argument relation that is a partial injective function. In other words, an argument mapping \(\uppi \) from \(N=\{1,\ldots ,n\}\) to \(M=\{1,\ldots ,m\}\) is a function \(\uppi : I_n \rightarrow I_m\), where \(I_n \subseteq N\), \(I_m \subseteq M\) and \(|I_n|=|I_m|\). Note that it can also be the empty mapping: \(\uppi :\emptyset \rightarrow \emptyset \). The inverse of an argument mapping is again an argument mapping.

Given a proximity relation \(\mathcal {R}\) over \(\mathcal {F}\), we assume that for each pair of function symbols f and g with \(\mathcal {R}(f, g) = \upalpha >0\), where f is n-ary and g is m-ary, there is also given an argument relation \(\uprho \) over \(\{1,\ldots ,n\}\times \{1,\ldots ,m\}\). We use the notation \(f \sim _{\mathcal {R},\upalpha }^\uprho g\). These argument relations should satisfy the following conditions: \(\uprho \) is the empty relation if f or g is a constant; \(\uprho \) is the identity if \(f=g\); \(f \sim _{\mathcal {R},\upalpha }^\uprho g\) iff \(g \sim _{\mathcal {R},\upalpha }^{\uprho ^{-1}} f\), where \(\uprho ^{-1}\) is the inverse of \(\uprho \).

Example 1

Assume that we have four different versions of defining the notion of author (e.g., originating from four different knowledge bases): \(author_1(first\text{-}name,\, middle\text{-}initial,\, last\text{-}name)\), \(author_2(first\text{-}name,\, last\text{-}name)\), \(author_3(last\text{-}name,\, first\text{-}name,\, middle\text{-}initial)\), and \(author_4(full\text{-}name)\). One could define the argument relations/mappings between these function symbols, e.g., as follows:

$$\begin{aligned}&author _1 \sim _{\mathcal {R},0.7}^{\{(1,1),(3,2)\}} author _2, \quad author _1 \sim _{\mathcal {R},0.9}^{\{(3,1),(1,2),(2,3)\}} author _3, \\&author _1 \sim _{\mathcal {R},0.5}^{\{(1,1),(3,1)\}} author _4, \quad author _2 \sim _{\mathcal {R},0.7}^{\{(1,2),(2,1)\}} author _3, \\&author _2 \sim _{\mathcal {R},0.5}^{\{(1,1),(2,1)\}} author _4, \quad author _3 \sim _{\mathcal {R},0.5}^{\{(1,1),(2,1)\}} author _4. \end{aligned}$$
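For illustration only, the sketch below shows one possible way such a proximity relation with argument relations could be encoded; the dictionary layout and the function `related` are our own choices and not part of the paper, and the data are those of Example 1.

```python
# An illustrative sketch: symbols are (name, arity) pairs; the table stores each proximal
# pair in one direction, and `related` recovers reflexivity and symmetry.

PROX = {
    (("author1", 3), ("author2", 2)): (0.7, {(1, 1), (3, 2)}),
    (("author1", 3), ("author3", 3)): (0.9, {(3, 1), (1, 2), (2, 3)}),
    (("author1", 3), ("author4", 1)): (0.5, {(1, 1), (3, 1)}),
    (("author2", 2), ("author3", 3)): (0.7, {(1, 2), (2, 1)}),
    (("author2", 2), ("author4", 1)): (0.5, {(1, 1), (2, 1)}),
    (("author3", 3), ("author4", 1)): (0.5, {(1, 1), (2, 1)}),
}

def related(f, g):
    """Return (degree, argument relation) for the symbols f and g."""
    if f == g:                                       # rho is the identity if f = g
        return 1.0, {(i, i) for i in range(1, f[1] + 1)}
    if (f, g) in PROX:
        return PROX[(f, g)]
    if (g, f) in PROX:
        deg, rho = PROX[(g, f)]
        return deg, {(j, i) for (i, j) in rho}       # the inverse argument relation
    return 0.0, set()

# author3 ~ author1 with degree 0.9 and the inverse of the relation given above
print(related(("author3", 3), ("author1", 3)))       # (0.9, {(1, 3), (2, 1), (3, 2)})
```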

Proximity Relations over Terms. Each proximity relation \(\mathcal {R}\) in this paper is defined on \(\mathcal {F}\cup \mathcal {V}\) such that \(\mathcal {R}(f,x)=0\) for all \(f\in \mathcal {F}\) and \(x\in \mathcal {V}\), and \(\mathcal {R}(x,y)=0\) for all \(x\ne y\), \(x,y\in \mathcal {V}\). We assume that \(\mathcal {R}\) is strict: for all \(w_1,w_2\in \mathcal {F}\cup \mathcal {V}\), if \(\mathcal {R}(w_1,w_2)=1\), then \(w_1=w_2\). Yet another assumption is that for each \(f\in \mathcal {F}\), its \(({\mathcal {R},\uplambda })\)-proximity class \(\{g \mid \mathcal {R}(f,g)\ge \uplambda \}\) is finite for any \(\mathcal {R}\) and \(\uplambda \).

We extend such an \(\mathcal {R}\) to terms from \(\mathcal {T}(\mathcal {F},\mathcal {V})\) as follows:

  (a) \(\mathcal {R}(t,s):=0\) if \(\mathcal {R}(\mathsf {head}(s),\mathsf {head}(t))=0\);

  (b) \(\mathcal {R}(t,s):=1\) if \(t=s\) and \(t,s\in \mathcal {V}\);

  (c) \(\mathcal {R}(t,s):=\mathcal {R}(f,g)\wedge \mathcal {R}(t_{i_1},s_{j_1})\wedge \cdots \wedge \mathcal {R}(t_{i_k},s_{j_k})\), if \(t=f(t_1,\ldots ,t_n)\), \(s=g(s_1,\ldots ,s_m)\), \(f \sim _{\mathcal {R},\uplambda }^\uprho g\), and \(\uprho =\{ (i_1, j_1),\ldots , (i_k, j_k) \}\).

If \(\mathcal {R}(t,s)\ge \uplambda \), we write \(t \simeq _{{\mathcal {R},\uplambda }} s\). When \(\uplambda =1\), the relation \(\simeq _{{\mathcal {R},\uplambda }}\) does not depend on \(\mathcal {R}\) due to strictness of the latter and is just the syntactic equality \(=\).

The \(({\mathcal {R},\uplambda })\)-proximity class of a term t is \(\mathbf{pc}_{\mathcal {R},\uplambda }(t):=\{ s\mid s \simeq _{{\mathcal {R},\uplambda }} t \}\).
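The recursive extension of \(\mathcal {R}\) to terms can be read off directly as code. The following is a minimal sketch under the same term encoding as above (variables as strings, "_" anonymous, compound terms as tuples); the symbol-level lookup and its data are assumptions made purely for illustration, and the T-norm is min.

```python
# Assumed example data: a unary h with h ~_{R,0.8}^{{(1,1),(1,2)}} f for a binary f.
EXAMPLE = {("h", "f"): (0.8, {(1, 1), (1, 2)})}

def prox(f, g, arity):
    """Symbol-level lookup: (R(f,g), argument relation), identity relation when f = g."""
    if f == g:
        return 1.0, {(i, i) for i in range(1, arity + 1)}
    if (f, g) in EXAMPLE:
        return EXAMPLE[(f, g)]
    if (g, f) in EXAMPLE:
        d, rho = EXAMPLE[(g, f)]
        return d, {(j, i) for (i, j) in rho}
    return 0.0, set()

def degree(t, s):
    """R(t, s) extended to terms by clauses (a)-(c) above."""
    if isinstance(t, str) or isinstance(s, str):
        return 1.0 if (t == s and isinstance(t, str)) else 0.0   # clauses (a), (b)
    d, rho = prox(t[0], s[0], len(t) - 1)
    if d == 0.0:
        return 0.0                                               # clause (a)
    return min([d] + [degree(t[i], s[j]) for (i, j) in rho])     # clause (c)

# R(h(a), f(a,a)) = 0.8 /\ R(a,a) /\ R(a,a) = 0.8
print(degree(("h", ("a",)), ("f", ("a",), ("a",))))
```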

Generalizations. Given \(\mathcal {R}\) and \(\lambda \), a term r is an \(({\mathcal {R},\uplambda })\)-generalization of (alternatively, \(({\mathcal {R},\uplambda })\)-more general than) a term t, written as \(r \precsim _{{\mathcal {R},\uplambda }} t\), if there exists a substitution \(\sigma \) such that \(\mathsf {deanon}(r)\sigma \simeq _{{\mathcal {R},\uplambda }} \mathsf {deanon}(t)\). The strict part of \(\precsim _{{\mathcal {R},\uplambda }}\) is denoted by \(\prec _{{\mathcal {R},\uplambda }}\), i.e., \(r \prec _{{\mathcal {R},\uplambda }} t\) if \(r \precsim _{{\mathcal {R},\uplambda }} t\) and not \(t \precsim _{{\mathcal {R},\uplambda }} r\).

Example 2

Given a proximity relation \(\mathcal {R}\), a cut value \(\uplambda \), constants \(a\sim _{\mathcal {R},\upalpha _1}^\emptyset b\) and \(b\sim _{\mathcal {R},\upalpha _2}^\emptyset c\), binary function symbols f and h, and a unary function symbol g such that \(h \sim _{\mathcal {R},\upalpha _3}^{\{(1,1), (1,2)\}} f\) and \(h \sim _{\mathcal {R},\upalpha _4}^{\{(1,1)\}} g\) with \(\upalpha _i\ge \uplambda \), \(1\le i \le 4\), we have

  • \(h(x,\_)\precsim _{{\mathcal {R},\uplambda }} h(a,x)\), because \(h(x,x')\{x\mapsto a,x'\mapsto x \} = h(a,x)\simeq _{\mathcal {R},\uplambda } h(a,x)\).

  • \(h(x,\_)\precsim _{{\mathcal {R},\uplambda }} h(\_,x)\), because \(h(x,x')\{x\! \mapsto \! y', x'\!\mapsto \! x \} =h(y',x)\simeq _{\mathcal {R},\uplambda } h(y',x)\).

  • \(h(x,x) \not \precsim _{{\mathcal {R},\uplambda }} h(\_,x)\), because \(h(x,x) \not \precsim _{{\mathcal {R},\uplambda }} h(y',x)\): no instance of h(x, x) is proximal to \(h(y',x)\), since \(x\sigma \) would have to be proximal to the two distinct variables \(y'\) and x simultaneously.

  • \(h(x,\_)\precsim _{{\mathcal {R},\uplambda }} f(a,c)\), because \(h(x,x')\{x\mapsto b \} = h(b,x')\simeq _{\mathcal {R},\uplambda } f(a,c)\).

  • \(h(x,\_)\precsim _{{\mathcal {R},\uplambda }} g(c)\), because \(h(x,x')\{x\mapsto c \} = h(c,x')\simeq _{\mathcal {R},\uplambda } g(c)\).

The notion of syntactic generalization of a term is a special case of \(({\mathcal {R},\uplambda })\)-generalization for \(\uplambda =1\). We write \(r \precsim t\) to indicate that r is a syntactic generalization of t. Its strict part is denoted by \(\prec \).

Since \(\mathcal {R}\) is strict, \(r\precsim t\) is equivalent to \(\mathsf {deanon}(r)\sigma = \mathsf {deanon}(t)\) for some \(\sigma \) (note the syntactic equality here).

Theorem 1

If \(r\precsim t\) and \(t \precsim _{{\mathcal {R},\uplambda }} s\), then \(r\precsim _{{\mathcal {R},\uplambda }} s\).

Proof

\(r\precsim t\) implies \(\mathsf {deanon}(r)\sigma = \mathsf {deanon}(t)\) for some \(\sigma \), while from \(t \precsim _{{\mathcal {R},\uplambda }} s\) we have \(\mathsf {deanon}(t)\vartheta \simeq _{\mathcal {R},\uplambda } \mathsf {deanon}(s)\) for some \(\vartheta \). Then \(\mathsf {deanon}(r)\sigma \vartheta \simeq _{\mathcal {R},\uplambda } \mathsf {deanon}(s)\), which implies \(r\precsim _{{\mathcal {R},\uplambda }} s\).    \(\Box \)

Note that \(r\precsim _{{\mathcal {R},\uplambda }} t\) and \(t \precsim _{{\mathcal {R},\uplambda }} s\), in general, do not imply \(r\precsim _{{\mathcal {R},\uplambda }} s\) due to non-transitivity of \(\simeq _{{\mathcal {R},\uplambda }}\).

Definition 1

(Minimal complete set of \(({\mathcal {R},\uplambda })\)-generalizations). Given \(\mathcal {R}\), \(\uplambda \), \(t_1\), and \(t_2\), a set of terms T is a complete set of \(({\mathcal {R},\uplambda })\)-generalizations of \(t_1\) and \(t_2\) if

  (a) every \(r\in T\) is an \(({\mathcal {R},\uplambda })\)-generalization of \(t_1\) and \(t_2\),

  (b) if \(r'\) is an \(({\mathcal {R},\uplambda })\)-generalization of \(t_1\) and \(t_2\), then there exists \(r\in T\) such that \(r'\precsim r\) (note that we use syntactic generalization here).

In addition, T is minimal, if it satisfies the following property:

  (c) if \(r,r'\in T\), \(r\ne r'\), then neither \(r\prec _{{\mathcal {R},\uplambda }} r'\) nor \(r'\prec _{{\mathcal {R},\uplambda }} r\).

A minimal complete set of \(({\mathcal {R},\uplambda })\)-generalizations ((\({\mathcal {R},\uplambda }\))-mcsg) of two terms is unique modulo variable renaming. The elements of the \(({\mathcal {R},\uplambda })\)-mcsg of \(t_1\) and \(t_2\) are called least general \(({\mathcal {R},\uplambda })\)-generalizations ((\({\mathcal {R},\uplambda }\))-lggs) of \(t_1\) and \(t_2\).

This definition directly extends to generalizations of finitely many terms.

The problem of computing an \(({\mathcal {R},\uplambda })\)-generalization of terms t and s is called the \(({\mathcal {R},\uplambda })\)-anti-unification problem of t and s. In anti-unification, the goal is to compute their least general \(({\mathcal {R},\uplambda })\)-generalizations.

The precise formulation of the anti-unification problem would be the following: Given \(\mathcal {R}\), \(\lambda \), \(t_1\), \(t_2\), find an \(({\mathcal {R},\uplambda })\)-lgg r of \(t_1\) and \(t_2\), substitutions \(\sigma _1\), \(\sigma _2\), and the approximation degrees \(\upalpha _1\), \(\upalpha _2\) such that \(\mathcal {R}(r\sigma _1, t_1)=\upalpha _1\) and \(\mathcal {R}(r\sigma _2, t_2)=\upalpha _2\). A minimal complete algorithm to solve this problem would compute exactly the elements of \(({\mathcal {R},\uplambda })\)-mcsg of \(t_1\) and \(t_2\) together with their approximation degrees. However, as we see below, it is problematic to solve the problem in this form. Therefore, we will consider a slightly modified variant, taking into account anonymous variables in generalizations and relaxing bounds on their degrees.

We assume that the terms to be generalized are ground. It is not a restriction because we can treat variables as constants that are close only to themselves.

Recall that the proximity class of any alphabet symbol is finite. Also, the symbols are related to each other by finitely many argument relations. One may think that it leads to finite proximity classes of terms, but this is not the case. Consider, e.g., \(\mathcal {R}\) and \(\uplambda \), where \(h \sim _{\mathcal {R},\uplambda }^{ \{(1,1)\} } f\) with binary h and unary f. Then the \(({\mathcal {R},\uplambda })\)-proximity class of f(a) is infinite: \(\{ f(a)\} \cup \{ h(a,t) \mid t \in \mathcal {T}(\mathcal {F},\mathcal {V}) \}\). Also, the \(({\mathcal {R},\uplambda })\)-mcsg for f(a) and f(b) is infinite: \(\{ f(x) \} \cup \{ h(x,t) \mid t\in \mathcal {T}(\mathcal {F},\emptyset ) \}\).

Definition 2

Given the terms \(t_1,\ldots ,t_n\), \(n\ge 1\), a position p in a term r is called irrelevant for \(({\mathcal {R},\uplambda })\)-generalizing (resp. for \(({\mathcal {R},\uplambda })\)-proximity to) \(t_1,\ldots ,t_n\) if \(r[s]_p \precsim _{{\mathcal {R},\uplambda }} t_i\) (resp. \(r[s]_p \simeq _{{\mathcal {R},\uplambda }} t_i\)) for all \(1\le i \le n\) and for all terms s.

We say that r is a relevant \(({\mathcal {R},\uplambda })\)-generalization (resp. relevant \(({\mathcal {R},\uplambda })\)-proximal term) of \(t_1,\ldots ,t_n\) if \(r \precsim _{{\mathcal {R},\uplambda }} t_i\) (resp. \(r \simeq _{{\mathcal {R},\uplambda }} t_i\)) for all \(1\le i \le n\) and \(r|_p=\_\ \,\) for all positions p in r that are irrelevant for generalizing (resp. for proximity to) \(t_1,\ldots ,t_n\). The \(({\mathcal {R},\uplambda })\)-relevant proximity class of t is

$$\begin{aligned} \mathbf{rpc}_{{\mathcal {R},\uplambda }}(t):=\{s \mid s \text { is a relevant }({\mathcal {R},\uplambda })\text {-proximal term of } t\}. \end{aligned}$$

In the example above, position 2 in h(x,t) is irrelevant for generalizing f(a) and f(b), and \(h(x,\_)\) is one of their relevant generalizations. Note that f(x) is also a relevant generalization of f(a) and f(b), since it contains no irrelevant positions. More general generalizations like, e.g., x, are relevant as well. Similarly, position 2 in h(a,t) is irrelevant for proximity to f(a) and \(\mathbf{rpc}_{{\mathcal {R},\uplambda }}(f(a))=\{f(a), h(a,\_) \}\). Generally, \(\mathbf{rpc}_{{\mathcal {R},\uplambda }}(t)\) is finite for any t due to the finiteness of proximity classes of symbols and argument relations mentioned above.

Definition 3

(Minimal complete set of relevant \(({\mathcal {R},\uplambda })\)-generalizations). Given \(\mathcal {R}\), \(\uplambda \), \(t_1\), and \(t_2\), a set of terms T is a complete set of relevant \(({\mathcal {R},\uplambda })\)-generalizations of \(t_1\) and \(t_2\) if

  (a) every element of T is a relevant \(({\mathcal {R},\uplambda })\)-generalization of \(t_1\) and \(t_2\), and

  (b) if r is a relevant \(({\mathcal {R},\uplambda })\)-generalization of \(t_1\) and \(t_2\), then there exists \(r'\in T\) such that \(r\precsim r'\).

The minimality property is defined as in Definition 1.

This definition directly extends to relevant generalizations of finitely many terms. We use \(({\mathcal {R},\uplambda })\)-mcsrg as an abbreviation for minimal complete set of relevant \(({\mathcal {R},\uplambda })\)-generalizations. Like relevant proximity classes, mcsrg’s are also finite.

Lemma 1

For given \(\mathcal {R}\) and \(\uplambda \), if all argument relations are correspondence relations, then \(({\mathcal {R},\uplambda })\)-mcsg’s and \(({\mathcal {R},\uplambda })\)-proximity classes for all terms are finite.

Proof

Under correspondence relations no term contains an irrelevant position for generalization or for proximity.    \(\Box \)

Hence, for correspondence relations the notions of mcsg and mcsrg coincide, as well as the notions of proximity class and relevant proximity class.

For a term r, we define its linearized version \(\mathsf {lin}(r)\) as a term obtained from r by replacing each occurrence of a named variable in r by a fresh one. For instance, \(\mathsf {lin}(f(x, \_, g(y,x,a), b)) = f(x', \_, g(y',x'',a), b),\) where \(x',x''\), \(y'\) are fresh variables. Linearized versions of terms are unique modulo variable renaming.
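A small sketch of linearization under the same term encoding as in the earlier snippets; the fresh names v1, v2, ... are an assumption made for illustration and are supposed not to occur elsewhere.

```python
# lin(r): each occurrence of a named variable is replaced by a distinct fresh variable,
# while anonymous variables "_" are kept.

import itertools

def lin(t, fresh=None):
    """Return a linearized copy of term t."""
    if fresh is None:
        fresh = itertools.count(1)
    if isinstance(t, str):
        return t if t == "_" else f"v{next(fresh)}"   # a fresh name per occurrence
    head, *args = t
    return (head, *(lin(a, fresh) for a in args))

# lin(f(x, _, g(y, x, a), b)) = f(v1, _, g(v2, v3, a), b)
print(lin(("f", "x", "_", ("g", "y", "x", ("a",)), ("b",))))
```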

Definition 4

(Generalization degree upper bound). Given two terms r and t, a proximity relation \(\mathcal {R}\), and a \(\lambda \)-cut, the \(({\mathcal {R},\uplambda })\)-generalization degree upper bound of r and t, denoted by \(\mathsf {gdub}_{{\mathcal {R},\uplambda }}(r,t)\), is defined as follows:

Let \(\upalpha := \max \{ \mathcal {R}(\mathsf {lin}(r)\sigma , t) \mid \sigma \text { is a substitution} \}\). Then \(\mathsf {gdub}_{{\mathcal {R},\uplambda }}(r,t)\) is \(\upalpha \) if \(\upalpha \ge \lambda \), and 0 otherwise.

Intuitively, \(\mathsf {gdub}_{{\mathcal {R},\uplambda }}(r,t)=\upalpha \) means that no instance of r can get closer than \(\upalpha \) to t in \(\mathcal {R}\). From the definition it follows that if \(r\precsim _{{\mathcal {R},\uplambda }} t\), then \(0< \uplambda \leqslant \mathsf {gdub}_{{\mathcal {R},\uplambda }}(r,t) \le 1\) and if \(r\not \precsim _{{\mathcal {R},\uplambda }} t\), then \(\mathsf {gdub}_{{\mathcal {R},\uplambda }}(r,t) = 0\).

The upper bound computed by \(\mathsf {gdub}\) is more relaxed than it would be if the linearization function were not used, but this is what we will be able to compute in our algorithms later.

Example 3

Let \(\mathcal {R}(a,b)=0.6\), \(\mathcal {R}(b,c)=0.7\), and \(\uplambda =0.5\). Then \(\mathsf {gdub}_{{\mathcal {R},\uplambda }}(f(x,b), f(a,c))=0.7\) and \(\mathsf {gdub}_{{\mathcal {R},\uplambda }}(f(x,x), f(a,c))=\mathsf {gdub}_{{\mathcal {R},\uplambda }}(f(x,y), f(a,c))=1\).

It is not difficult to see that if \(r\sigma \simeq _{\mathcal {R},\uplambda } t\), then \(\mathcal {R}(r\sigma ,t)\le \mathsf {gdub}_{{\mathcal {R},\uplambda }}(r,t)\). In Example 3, for \(\sigma =\{x\mapsto b\}\) we have \(\mathcal {R}(f(x,x)\sigma , f(a,c))= \mathcal {R}(f(b,b), f(a,c))= 0.6 < \mathsf {gdub}_{{\mathcal {R},\uplambda }}(f(x,x), f(a,c)) = 1\).

We compute \(\mathsf {gdub}_{{\mathcal {R},\uplambda }}(r,t)\) as follows: If r is a variable, then \(\mathsf {gdub}_{{\mathcal {R},\uplambda }}(r,t)=1\). Otherwise, if \(\mathsf {head}(r) \sim _{\mathcal {R},\upbeta }^\uprho \mathsf {head}(t)\), then \( \mathsf {gdub}_{{\mathcal {R},\uplambda }}(r,t) = \upbeta \wedge \bigwedge _{(i,j)\in \uprho } \mathsf {gdub}_{{\mathcal {R},\uplambda }}(r|_i, t|_j).\) Otherwise, \(\mathsf {gdub}_{{\mathcal {R},\uplambda }}(r,t)=0\).
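This recursion can be sketched as follows; the symbol-level lookup and its data (taken from Example 3) are assumptions for illustration, and the final \(\uplambda \)-cut of Definition 4 is applied on top. Treating every variable occurrence of r independently in the clause "variable \(\mapsto \) 1" corresponds to using \(\mathsf {lin}(r)\).

```python
# Assumed data of Example 3: R(a,b)=0.6, R(b,c)=0.7.
EXAMPLE = {("a", "b"): (0.6, set()), ("b", "c"): (0.7, set())}

def prox(f, g, arity):
    if f == g:
        return 1.0, {(i, i) for i in range(1, arity + 1)}
    if (f, g) in EXAMPLE:
        return EXAMPLE[(f, g)]
    if (g, f) in EXAMPLE:
        d, rho = EXAMPLE[(g, f)]
        return d, {(j, i) for (i, j) in rho}
    return 0.0, set()

def gdub_raw(r, t):
    if isinstance(r, str):                 # a variable: its instance can match anything
        return 1.0
    if isinstance(t, str):                 # a non-variable cannot approximate a variable
        return 0.0
    d, rho = prox(r[0], t[0], len(r) - 1)
    if d == 0.0:
        return 0.0
    return min([d] + [gdub_raw(r[i], t[j]) for (i, j) in rho])

def gdub(r, t, lam):
    alpha = gdub_raw(r, t)
    return alpha if alpha >= lam else 0.0  # Definition 4's lambda-cut

# Example 3 with lambda = 0.5
print(gdub(("f", "x", ("b",)), ("f", ("a",), ("c",)), 0.5))        # 0.7
print(gdub(("f", "x", "x"), ("f", ("a",), ("c",)), 0.5))           # 1.0
```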

3 Term Set Consistency

The notion of term set consistency plays an important role in the computation of proximal generalizations. Intuitively, a set of terms is \(({\mathcal {R},\uplambda })\)-consistent if all the terms in the set have a common \(({\mathcal {R},\uplambda })\)-proximal term. In this section, we discuss this notion and the corresponding algorithms.

Definition 5

(Consistent set of terms). A finite set of terms T is \(({\mathcal {R},\uplambda })\)-consistent if there exists a term s such that \(s \simeq _{{\mathcal {R},\uplambda }} t\) for all \(t\in T\).

\(({\mathcal {R},\uplambda })\)-consistency of a finite term set T is equivalent to \(\bigcap _{t\in T} \mathbf{pc}_{{\mathcal {R},\uplambda }}(t) \ne \emptyset \), but we cannot use this property to decide consistency, since proximity classes of terms can be infinite (when the argument relations are not restricted). For this reason, we introduce the operation \(\sqcap \) on terms as follows: (i) \(t \sqcap \_ = \_\sqcap t = t\), (ii) \(f(t_1,\ldots ,t_n)\sqcap f(s_1,\ldots ,s_n)= f(t_1\sqcap s_1,\ldots ,t_n\sqcap s_n)\), \(n\ge 0\). Obviously, \(\sqcap \) is associative (A), commutative (C), idempotent (I), and has \(\_\) as its unit element (U). It can be extended to sets of terms: \( T_1 \sqcap T_2 := \{ t_1 \sqcap t_2 \mid t_1 \in T_1, t_2 \in T_2 \}.\) It is easy to see that \(\sqcap \) on sets also satisfies the ACIU properties with the set \(\{\_ \}\) playing the role of the unit element.
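The following minimal sketch (again our own encoding, not the paper's) implements \(\sqcap \) on terms and its extension to sets, encoding undefinedness by None.

```python
def sqcap(t, s):
    """t ⊓ s: "_" is the unit, equal heads are merged argument-wise, otherwise undefined."""
    if t == "_":
        return s
    if s == "_":
        return t
    if isinstance(t, str) or isinstance(s, str):
        return t if t == s else None
    if t[0] != s[0] or len(t) != len(s):
        return None
    args = [sqcap(a, b) for a, b in zip(t[1:], s[1:])]
    return None if None in args else (t[0], *args)

def sqcap_sets(T1, T2):
    """T1 ⊓ T2: all defined pairwise merges."""
    return {r for t1 in T1 for t2 in T2 if (r := sqcap(t1, t2)) is not None}

# From Example 4 below: {h(b,_,_)} ⊓ {h(_,a,_), h(_,b,_)} = {h(b,a,_), h(b,b,_)}
print(sqcap_sets({("h", ("b",), "_", "_")},
                 {("h", "_", ("a",), "_"), ("h", "_", ("b",), "_")}))
```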

Lemma 2

A finite set of terms T is \(({\mathcal {R},\uplambda })\)-consistent iff \(\sqcap _{t\in T} \mathbf{rpc}_{{\mathcal {R},\uplambda }}(t)\ne \emptyset \).

Proof

\((\Rightarrow )\) If \(s \simeq _{{\mathcal {R},\uplambda }} t\) for all \(t\in T\), then \(s_t \in \mathbf{rpc}_{{\mathcal {R},\uplambda }}(t)\), where \(s_t\) is obtained from s by replacing all subterms that are irrelevant for its \(({\mathcal {R},\uplambda })\)-proximity to t by \(\_\). Assume \(T=\{t_1,\ldots ,t_n\}\). Then \(s_{t_1}\sqcap \cdots \sqcap s_{t_n}\in \sqcap _{t\in T} \mathbf{rpc}_{{\mathcal {R},\uplambda }}(t)\).

\((\Leftarrow )\) Obvious, since \(s \simeq _{{\mathcal {R},\uplambda }} t\) for \(s\in \sqcap _{t\in T} \mathbf{rpc}_{{\mathcal {R},\uplambda }}(t)\) and for all \(t\in T\).    \(\Box \)

Now we design an algorithm \(\mathfrak {C}\) that computes \(\sqcap _{t\in T} \mathbf{rpc}_{{\mathcal {R},\uplambda }}(t)\) without actually computing \(\mathbf{rpc}_{{\mathcal {R},\uplambda }}(t)\) for each \(t\in T\). A special version of the algorithm can be used to decide the \(({\mathcal {R},\uplambda })\)-consistency of T.

The algorithm is rule-based. The rules work on states, which are pairs \(\mathbf {I};s\), where s is a term and \(\mathbf {I}\) is a finite set of expressions of the form \(x \mathop {\mathsf {\,in\,}}T\), where T is a finite set of terms. \(\mathcal {R}\) and \(\uplambda \) are given. There are two rules (\(\uplus \) stands for disjoint union):

Rem: Removing the empty set

\({\{ x \mathop {\mathsf {\,in\,}}\emptyset \}\uplus \mathbf {I}; s \Longrightarrow \mathbf {I}; s\{x \mapsto \_\}.}\)

Red: Reduce a set to new sets

\({\{ x \mathop {\mathsf {\,in\,}}\{t_1,\ldots ,t_m \} \}\uplus \mathbf {I};s \Longrightarrow \{ y_1 \mathop {\mathsf {\,in\,}}T_1,\ldots , y_n \mathop {\mathsf {\,in\,}}T_n \}\cup \mathbf {I}; s\{x \mapsto h(y_1,\ldots ,y_n)\}, }\)

where \(m\ge 1\), h is an n-ary function symbol such that \(h \sim ^{\uprho _k}_{\mathcal {R},\upgamma _k} \mathsf {head}(t_k)\) with \(\upgamma _k\ge \uplambda \) for all \(1\le k \le m\), and \(T_i:= \{ t_k|_j \mid (i,j)\in \uprho _k, 1\le k \le m\}\), \(1\le i \le n\), is the set of all those arguments of the terms \(t_1,\ldots ,t_m\) that are supposed to be \(({\mathcal {R},\uplambda })\)-proximal to the i-th argument of h.

To compute \(\sqcap _{t\in T} \mathbf{rpc}_{{\mathcal {R},\uplambda }}(t)\), \(\mathfrak {C}\) starts with \(\{ x \mathop {\mathsf {\,in\,}}T \};x\) and applies the rules as long as possible. Red causes branching. A state of the form \(\emptyset ;s\) is called a success state. A failure state has the form \(\mathbf {I};s\), to which no rule applies and \(\mathbf {I}\ne \emptyset \). In the full derivation tree, each leaf is either a success or a failure state.

Example 4

Assume a, b, c are constants and g, f, h are function symbols with arities 1, 2, and 3, respectively. Let \(\uplambda \) be given and \(\mathcal {R}\) be defined so that \(\mathcal {R}(a,b)\ge \uplambda \), \(\mathcal {R}(b,c)\ge \uplambda \), \(h \sim _{\mathcal {R},{\upbeta }}^{\{ (1,1),(1,2) \}} f\), \(h \sim _{\mathcal {R},{\upgamma }}^{\{ (2,1) \}} g\) with \(\upbeta \ge \uplambda \) and \(\upgamma \ge \uplambda \). Then

$$\begin{aligned} \mathbf{rpc}_{{\mathcal {R},\uplambda }}(f(a,c)) = {}&\{ f(a,c),\, f(b,c), \, f(a,b), \, f(b,b), \, h(b,\_,\_) \}, \\ \mathbf{rpc}_{{\mathcal {R},\uplambda }}(g(a)) = {}&\{ g(a),\, g(b), h(\_,a, \_), h(\_, b, \_) \}, \end{aligned}$$

and \(\mathbf{rpc}_{{\mathcal {R},\uplambda }}(f(a,c)){\sqcap }\mathbf{rpc}_{{\mathcal {R},\uplambda }}(g(a))\! =\! \{h(b,a,\!\_), h(b,b,\!\_) \}. \) We show how to compute this set with \(\mathfrak {C}\): \(\{x\,\mathsf {in}\, \{f(a,c), g(a)\}\}; \,x \Longrightarrow _{\textsf {Red}} \{ y_1\,\mathsf {in}\, \{ a, c \}, y_2 \mathop {\mathsf {\,in\,}}\{ a \}, y_3 \mathop {\mathsf {\,in\,}}\emptyset \};\) \( h(y_1,y_2,y_3) \Longrightarrow _{\textsf {Rem}} \{ y_1 \;\mathsf {in}\; \{ a, c \},\; y_2 \;\mathsf {in}\; \{ a \} \};\; h(y_1,y_2,\_) \Longrightarrow _{\textsf {Red}} \{ y_2 \;\mathsf {in}\; \{ a \} \}; h(b,y_2,\_).\) Here we have two ways to apply Red to the last state, leading to two elements of \(\mathbf{rpc}_{{\mathcal {R},\uplambda }}(f(a,c)) \sqcap \mathbf{rpc}_{{\mathcal {R},\uplambda }}(g(a))\): \(h(b,a,\_)\) and \(h(b,b,\_)\).
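The algorithm \(\mathfrak {C}\) can also be sketched as a single recursion in which Red and Rem are folded together. The snippet below is a hedged illustration using the signature of Example 4; the concrete degrees in the table are assumed values (the example only requires them to be at least \(\uplambda \)).

```python
from itertools import product

LAM = 0.5
ARITY = {"a": 0, "b": 0, "c": 0, "g": 1, "f": 2, "h": 3}
REL = {("a", "b"): (0.9, set()), ("b", "c"): (0.9, set()),
       ("h", "f"): (0.8, {(1, 1), (1, 2)}), ("h", "g"): (0.7, {(2, 1)})}

def rel(h, f):
    """Symbol-level proximity: (degree, argument relation) or None."""
    if h == f:
        return 1.0, {(i, i) for i in range(1, ARITY[h] + 1)}
    if (h, f) in REL:
        return REL[(h, f)]
    if (f, h) in REL:
        d, rho = REL[(f, h)]
        return d, {(j, i) for (i, j) in rho}
    return None

def common(T):
    """All terms in the meet of the relevant proximity classes of the terms in T."""
    if not T:                      # rule Rem: an unconstrained position becomes "_"
        return {"_"}
    result = set()
    for h, n in ARITY.items():     # rule Red: try every candidate head h
        rels = [rel(h, t[0]) for t in T]
        if any(r is None or r[0] < LAM for r in rels):
            continue
        # T_i: the arguments of the terms in T related to the i-th argument of h
        Ts = [[t[q] for t, (d, rho) in zip(T, rels) for (i, q) in rho if i == pos]
              for pos in range(1, n + 1)]
        subterms = [common(Ti) for Ti in Ts]
        if all(subterms):
            result |= {(h, *args) for args in product(*subterms)}
    return result

# Common relevant proximal terms of f(a,c) and g(a), as in Example 4: h(b,a,_), h(b,b,_)
print(common([("f", ("a",), ("c",)), ("g", ("a",))]))
```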

Theorem 2

Given a finite set of terms T, the algorithm \(\mathfrak {C}\) always terminates starting from the state \(\{x \mathop {\mathsf {\,in\,}}T\};x\) (where x is a fresh variable). If S is the set of success states produced at the end, we have \(\{ s \mid \emptyset ;s\in S \} = \sqcap _{t\in T} \mathbf{rpc}_{{\mathcal {R},\uplambda }}(t)\).

Proof

Termination: Associate to each state \(\{ x_1 \mathop {\mathsf {\,in\,}}T_1,\ldots x_n \mathop {\mathsf {\,in\,}}T_n \};s\) the multiset \(\{d_1,\ldots ,d_n\}\), where \(d_i\) is the maximum depth of terms occurring in \(T_i\). \(d_i=0\) if \(T_i=\emptyset \). Compare these multisets by the Dershowitz-Manna ordering [5]. Each rule strictly reduces them, which implies termination.

By the definitions of \(\mathbf{rpc}_{{\mathcal {R},\uplambda }}\) and \(\sqcap \), \(h(s_1,\ldots ,s_n)\in \sqcap _{t\in \{t_1,\ldots ,t_m\}} \mathbf{rpc}_{{\mathcal {R},\uplambda }}(t)\) iff \(h \sim ^{\uprho _k}_{\mathcal {R},\upgamma _k} \mathsf {head}(t_k)\) with \(\upgamma _k\ge \uplambda \) for all \(1\le k \le m\) and \(s_i \in \sqcap _{t\in T_i} \mathbf{rpc}_{{\mathcal {R},\uplambda }}(t)\), where \(T_i= \{ t_k|_j \mid (i,j)\in \uprho _k, 1\le k \le m\}\), \(1\le i \le n\). Therefore, in the Red rule, the instance of x (which is \(h(y_1,\ldots ,y_n)\)) is in \(\sqcap _{t\in \{t_1,\ldots ,t_m\}} \mathbf{rpc}_{{\mathcal {R},\uplambda }}(t)\) iff for each \(1\le i \le n\) we can find an instance of \(y_i\) in \(\sqcap _{t\in T_i} \mathbf{rpc}_{{\mathcal {R},\uplambda }}(t)\). If \(T_i\) is empty, it means that the i-th argument of h is irrelevant for terms in \(\{t_1,\ldots ,t_m\}\) and can be replaced by \(\_\). (Rem does it in a subsequent step.) Hence, in each success branch of the derivation tree, the algorithm \(\mathfrak {C}\) computes one element of \(\sqcap _{t\in T} \mathbf{rpc}_{{\mathcal {R},\uplambda }}(t)\). Branching at Red helps produce all elements of \(\sqcap _{t\in T} \mathbf{rpc}_{{\mathcal {R},\uplambda }}(t)\).    \(\Box \)

It is easy to see how to use \(\mathfrak {C}\) to decide the \(({\mathcal {R},\uplambda })\)-consistency of T: it is enough to find one successful branch in the \(\mathfrak {C}\)-derivation tree for \(\{x\mathop {\mathsf {\,in\,}}T\};x\). If there is no such branch, then T is not \(({\mathcal {R},\uplambda })\)-consistent. In fact, during the derivation we can even ignore the second component of the states.

4 Solving Generalization Problems

Now we can reformulate the anti-unification problem that will be solved in the remaining part of the paper. \(\mathcal {R}\) is a proximity relation and \(\uplambda \) is a cut value.

Given: \(\mathcal {R}\), \(\uplambda \), and the ground terms \(t_1,\ldots ,t_n\), \(n\ge 2\).

Find: a set \(\mathsf {S}\) of tuples \((r,\sigma _1,\ldots ,\sigma _n,\upalpha _1,\ldots ,\upalpha _n)\) such that

  • \(\{ r \mid (r,\ldots ) \in \mathsf {S}\}\) is an \(({\mathcal {R},\uplambda })\)-mcsrg of \(t_1,\ldots ,t_n\),

  • \(r\sigma _i \simeq _{{\mathcal {R},\uplambda }} t_i\) and \(\upalpha _i=\mathsf {gdub}_{{\mathcal {R},\uplambda }}(r,t_i)\), \(1\le i \le n\), for each \((r,\sigma _1,\ldots ,\sigma _n,\upalpha _1,\ldots ,\upalpha _n) \in \mathsf {S}\).

(When \(n=1\), this is a problem of computing a relevant proximity class of a term.) Below we give a set of rules, from which one can obtain algorithms to solve the anti-unification problem for four versions of argument relations:

  1. The most general (unrestricted) case; see algorithm \(\mathfrak {A}_1\) below; the computed set of generalizations is an mcsrg;

  2. Correspondence relations: using the same algorithm \(\mathfrak {A}_1\), the computed set of generalizations is an mcsg;

  3. Mappings: using a dedicated algorithm \(\mathfrak {A}_2\), the computed set of generalizations is an mcsrg;

  4. Correspondence mappings (bijections): using the same algorithm \(\mathfrak {A}_2\), the computed set of generalizations is an mcsg.

Each of them also has a corresponding linear variant, computing minimal complete sets of (relevant) linear \(({\mathcal {R},\uplambda })\)-generalizations. They are denoted by adding the superscript \(\mathsf {lin}\) to the corresponding algorithm name: \(\mathfrak {A}^\mathsf {lin}_1\) and \(\mathfrak {A}^\mathsf {lin}_2\).

For simplicity, we formulate the algorithms for the case \(n=2\). They can be extended for arbitrary n straightforwardly.

The main data structure in these algorithms is an anti-unification triple (AUT) \(x:T_1\triangleq T_2\), where \(T_1\) and \(T_2\) are finite consistent sets of ground terms. The idea is that x is a common generalization of all terms in \(T_1\cup T_2\). A configuration is a tuple \(A;S;r;\upalpha _1;\upalpha _2\), where A is a set of AUTs to be solved, S is a set of solved AUTs (the store), r is the generalization computed so far, and the \(\upalpha \)’s are the current approximations of generalization degree upper bounds of r for the input terms.

Before formulating the rules, we discuss one peculiarity of approximate generalizations:

Example 5

For a given \(\mathcal {R}\) and \(\lambda \), assume \(\mathcal {R}(a,b) \ge \lambda \), \(\mathcal {R}(b,c) \ge \lambda \), \(h \sim _{\mathcal {R},\upalpha }^{\{(1,1),(1,2)\}} f\) and \(h \sim _{\mathcal {R},\upbeta }^{\{(1,1)\}} g\), where f is binary, g and h are unary, \(\upalpha \ge \lambda \) and \(\upbeta \ge \lambda \). Then

  • h(b) is an \(({\mathcal {R},\uplambda })\)-generalization of f(a,c) and g(a).

  • x is the only \(({\mathcal {R},\uplambda })\)-generalization of f(a,d) and g(a). One may be tempted to have h as the head of the generalization, e.g., h(x), but x cannot be instantiated by any term that would be \(({\mathcal {R},\uplambda })\)-close to both a and d, since in the given \(\mathcal {R}\), d is \(({\mathcal {R},\uplambda })\)-close only to itself. Hence, there would be no instance of h(x) that is \(({\mathcal {R},\uplambda })\)-close to f(a,d). Since there is no other alternative (except h) for the common neighbor of f and g, the generalization should be a fresh variable x.

This example shows that generalization algorithms should take into account not only the heads of the terms to be generalized, but should also look deeper, to make sure that the arguments grouped together by the given argument relation have a common neighbor. This justifies the requirement of consistency of a set of arguments, the notion introduced in the previous section and used in the decomposition rule below.

4.1 Anti-unification for Unrestricted Argument Relations

Algorithms \(\mathfrak {A}^\mathsf {lin}_1\) and \(\mathfrak {A}_1\) use the rules below to transform configurations into configurations. Given \(\mathcal {R}\), \(\uplambda \), and the ground terms \(t_1\) and \(t_2\), we create the initial configuration \(\{x:\{t_1\} \triangleq \{t_2 \}\}; \emptyset ; x;1;1\) and apply the rules as long as possible. Note that the rules preserve consistency of AUTs. The process generates a finite complete tree of derivations, whose terminal nodes have configurations with the first component empty. We will show how from these terminal configurations one collects the result as required in the anti-unification problem statement.

Tri: Trivial

\({\{x: \emptyset \triangleq \emptyset \}\uplus A;\, S;\,r;\,\upalpha _1;\upalpha _2\Longrightarrow A;\, S;\,r\{x\mapsto \_\};\, \upalpha _1;\upalpha _2.}\)

Dec: Decomposition

$$\begin{aligned}&\{x: T_1\triangleq T_2 \}\uplus A; S;r;\upalpha _1;\upalpha _2 \Longrightarrow \\&\qquad \{ y_i : Q_{i1}\triangleq Q_{i2} \mid 1\le i \le n \}\cup A; S; r \{x\mapsto h(y_1,\ldots ,y_n)\}; \upalpha _1\wedge \upbeta _1;\upalpha _2\wedge \upbeta _2, \end{aligned}$$

where \(T_1\cup T_2 \ne \emptyset \); h is n-ary with \(n\ge 0\); \(y_1,\ldots ,y_n\) are fresh; and for \(j=1,2\), if \(T_j= \{t^j_1,\ldots , t^j_{m_j} \}\), then

  • \(h \sim _{\mathcal {R},\upgamma ^j_k}^{\uprho ^j_k} \mathsf {head}(t^j_k)\) with \(\upgamma ^j_k\ge \uplambda \) for all \(1\le k \le {m_j}\) and \(\upbeta _j = \upgamma ^j_1 \wedge \cdots \wedge \upgamma ^j_{m_j}\) (note that \(\upbeta _j=1\) if \({m_j}=0\)),

  • for all \(1\le i \le n\), \(Q_{ij} = \cup _{k=1}^{m_j}\{ t^j_k|_q \mid (i,q) \in \uprho ^j_k\}\) and is \(({\mathcal {R},\uplambda })\)-consistent.

Sol: Solving

\({\{x:T_1 \triangleq T_2 \}\uplus A;\, S;\,r;\,\upalpha _1; \upalpha _2 \Longrightarrow A;\, \{x: T_1 \triangleq T_2\}\cup S;\,r;\,\upalpha _1;\upalpha _2,}\)

if the Tri and Dec rules are not applicable. (This means that at least one \(T_i\ne \emptyset \) and either there is no h as required in the Dec rule, or at least one \(Q_{ij}\) from Dec is not \(({\mathcal {R},\uplambda })\)-consistent.)

Let \(\mathsf {expand}\) be an expansion operation defined for sets of AUTs as

$$\begin{aligned} \mathsf {expand}(S):= \{ x: \mathop {\sqcap }\limits _{t\in T_1} \mathbf{rpc}_{{\mathcal {R},\uplambda }}(t) \triangleq \mathop {\sqcap }\limits _{t\in T_2} \mathbf{rpc}_{{\mathcal {R},\uplambda }}(t) \mid x : T_1 \triangleq T_2 \in S \}. \end{aligned}$$

Exhaustive application of the three rules above leads to configurations of the form \(\emptyset ; S; r; \upalpha _1;\upalpha _2\), where r is a linear term. These configurations are further postprocessed, replacing S by \(\mathsf {expand}(S)\). We will use the letter E for expanded stores. Hence, terminal configurations obtained after the exhaustive rule application and expansion have the form \(\emptyset ; E; r; \upalpha _1;\upalpha _2\), where r is a linear term. This is what Algorithm \(\mathfrak {A}^\mathsf {lin}_1\) stops with.
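The argument-grouping part of the Dec rule can be sketched as follows. This is an illustration only: the consistency test is passed in as a parameter (a full implementation would use algorithm \(\mathfrak {C}\) of Sect. 3), the relation data are those of Example 6 below, and the simplistic default test only covers sets with at most one element, which suffices for the step shown.

```python
LAM = 0.5
ARITY = {"a": 0, "b": 0, "c": 0, "d": 0, "f": 2, "g": 3, "h": 4}
REL = {("b", "c"): (0.5, set()), ("c", "d"): (0.6, set()),
       ("h", "f"): (0.7, {(1, 1), (3, 2), (4, 2)}),
       ("h", "g"): (0.8, {(1, 1), (3, 3)})}

def rel(h, f):
    if h == f:
        return 1.0, {(i, i) for i in range(1, ARITY[h] + 1)}
    if (h, f) in REL:
        return REL[(h, f)]
    if (f, h) in REL:
        d, rho = REL[(f, h)]
        return d, {(j, i) for (i, j) in rho}
    return None

def dec_step(T1, T2, h, consistent=lambda Q: len(Q) <= 1):
    """Try Dec with head h; return ([(Q_i1, Q_i2)], beta1, beta2) or None."""
    n = ARITY[h]
    betas, groups = [], []
    for T in (T1, T2):
        rels = [rel(h, t[0]) for t in T]
        if any(r is None or r[0] < LAM for r in rels):
            return None
        betas.append(min([1.0] + [d for d, rho in rels]))
        groups.append([[t[q] for t, (d, rho) in zip(T, rels) for (i, q) in rho if i == pos]
                       for pos in range(1, n + 1)])
    pairs = list(zip(groups[0], groups[1]))
    if not all(consistent(Q1) and consistent(Q2) for Q1, Q2 in pairs):
        return None
    return pairs, betas[0], betas[1]

# First Dec step of Example 6: f(a,b) vs g(a,c,d) with head h gives the groups
# ({a},{a}), (0,0), ({b},{d}), ({b},0) and the degrees 0.7 and 0.8
print(dec_step([("f", ("a",), ("b",))], [("g", ("a",), ("c",), ("d",))], "h"))
```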

To an expanded store \(E=\{y_1 : Q_{11} \triangleq Q_{12}, \ldots , y_n : Q_{n1} \triangleq Q_{n2} \}\) we associate two sets of substitutions \(\varSigma _L(E)\) and \(\varSigma _R(E)\), defined as follows: \(\sigma \in \varSigma _L(E)\) (resp. \(\sigma \in \varSigma _R(E)\)) iff \(\mathsf {dom}(\sigma )= \{ y_1, \ldots , y_n\}\) and \(y_i\sigma \in Q_{i1}\) (resp. \(y_i\sigma \in Q_{i2}\)) for each \(1\le i \le n\). We call them the sets of witness substitutions.

Configurations containing expanded stores are called expanded configurations. From each expanded configuration \(C=\emptyset ; E; r; \upalpha _1;\upalpha _2\), we construct the set \(\mathsf {S}(C):=\{ (r,\sigma _1,\sigma _2,\upalpha _1,\upalpha _2) \mid \sigma _1 \in \varSigma _L(E), \sigma _2 \in \varSigma _R(E) \}\).

Given an anti-unification problem \(\mathcal {R}\), \(\uplambda \), \(t_1\) and \(t_2\), the answer computed by Algorithm \(\mathfrak {A}^\mathsf {lin}_1\) is the set \(\mathsf {S} := \cup _{i=1}^m \mathsf {S}(C_i)\), where \(C_1,\ldots ,C_m\) are all of the final expanded configurations reached by \(\mathfrak {A}^\mathsf {lin}_1\) for \(\mathcal {R}\), \(\uplambda \), \(t_1\), and \(t_2\).

Example 6

Assume a, b, c, and d are constants with \(b \sim _{\mathcal {R},0.5}^\emptyset c\), \(c \sim _{\mathcal {R},0.6}^\emptyset d\), and f, g and h are respectively binary, ternary and quaternary function symbols with \(h \sim _{\mathcal {R},0.7}^{ \{ (1,1), (3,2), (4,2) \} } f\) and \(h \sim _{\mathcal {R},0.8}^{ \{ (1,1), (3,3) \} } g\). For the proximity relation \(\mathcal {R}\) given in this way and \(\uplambda =0.5\), Algorithm \(\mathfrak {A}^\mathsf {lin}_1\) performs the following steps to anti-unify f(a,b) and g(a,c,d):

$$\begin{aligned}&\{ x: \{f(a,b)\} \triangleq \{ g(a,c,d) \} \};\emptyset ;x;1;1 \Longrightarrow _{\textsf {Dec}}\\&\{ x_1 : \{a\} \triangleq \{a \}, \, x_2 : \emptyset \triangleq \emptyset , \, x_3 : \{b\} \triangleq \{ d \}, \\&\qquad x_4 : \{b\} \triangleq \emptyset \}; \emptyset ; h(x_1,x_2,x_3,x_4); 0.7; 0.8 \Longrightarrow _{\textsf {Dec}}\\&\{ x_2 : \emptyset \triangleq \emptyset , \, x_3 : \{b\} \triangleq \{ d \}, \, x_4 : \{b\} \triangleq \emptyset \}; \emptyset ; h(a,x_2,x_3,x_4); 0.7; 0.8 \Longrightarrow _{\textsf {Tri}}\\&\{ x_3 : \{b\} \triangleq \{ d \}, \, x_4 : \{b\} \triangleq \emptyset \}; \emptyset ; h(a,\_,x_3,x_4); 0.7; 0.8 \Longrightarrow _{\textsf {Dec}}\\&\{ x_4 : \{b\} \triangleq \emptyset \};\emptyset ; h(a,\_,c,x_4); 0.5; 0.6. \end{aligned}$$

Here Dec applies in two different ways, with the substitutions \(\{x_4 \mapsto b \}\) and \(\{x_4 \mapsto c \}\), leading to two final configurations: \(\emptyset ; \emptyset ; h(a,\_,c,b); 0.5; 0.6\) and \(\emptyset ; \emptyset ; h(a,\_,c,c); 0.5; 0.6\). The witness substitutions are the identity substitutions. We have \(\mathcal {R}(h(a,\_,c,b),f(a,b))= 0.5\), \(\mathcal {R}(h(a,\_,c,b),g(a,c,d))= 0.6\), \(\mathcal {R}(h(a,\_,c,c),f(a,b))= 0.5\), and \(\mathcal {R}(h(a,\_,c,c), g(a,c,d))= 0.6\).

If we had \(h \sim _{\mathcal {R},0.7}^{ \{ (1,1), (1,2), (4,2) \} } f\), then the algorithm would perform only the Sol step, because in the attempt to apply Dec to the initial configuration, the set \(Q_{11}=\{a,b\}\) is inconsistent: \(\mathbf{rpc}_{{\mathcal {R},\uplambda }}(a)=\{a\}\), \(\mathbf{rpc}_{{\mathcal {R},\uplambda }}(b)=\{b,c\}\), and, hence, \(\mathbf{rpc}_{{\mathcal {R},\uplambda }}(a) \sqcap \mathbf{rpc}_{{\mathcal {R},\uplambda }}(b)=\emptyset \).

Algorithm \(\mathfrak {A}_1\) is obtained by further transforming the expanded configurations produced by \(\mathfrak {A}^\mathsf {lin}_1\). This transformation is performed by applying the Merge rule below as long as possible. Intuitively, its purpose is to make the linear generalization obtained by \(\mathfrak {A}^\mathsf {lin}_1\) less general by merging some variables.

Mer: Merge

$$\begin{aligned}&\emptyset ; \{ x_1 : R_{11} \triangleq R_{12}, \, x_2 : R_{21} \triangleq R_{22} \}\uplus E;\,r;\,\upalpha _1;\,\upalpha _2\Longrightarrow \\&\qquad \emptyset ;\, \{y:Q_1 \triangleq Q_2\}\cup E;\,r\sigma ; \,\upalpha _1;\,\upalpha _2, \end{aligned}$$

where \(Q_i = (R_{1i}\sqcap R_{2i})\ne \emptyset \), \(i=1,2\), y is fresh, and \(\sigma = \{x_1\mapsto y, \, x_2 \mapsto y\}\).
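One Mer step can be sketched as follows, reusing the \(\sqcap \) sketch from Sect. 3 (repeated here to keep the snippet self-contained); the term data correspond to the expanded store entries of Example 7 below.

```python
def sqcap(t, s):
    if t == "_":
        return s
    if s == "_":
        return t
    if isinstance(t, str) or isinstance(s, str):
        return t if t == s else None
    if t[0] != s[0] or len(t) != len(s):
        return None
    args = [sqcap(a, b) for a, b in zip(t[1:], s[1:])]
    return None if None in args else (t[0], *args)

def meet(T1, T2):
    return {r for a in T1 for b in T2 if (r := sqcap(a, b)) is not None}

def merge(entry1, entry2, fresh="y"):
    """Try Mer on two store entries (x, R1, R2); return the merged entry or None."""
    q1 = meet(entry1[1], entry2[1])
    q2 = meet(entry1[2], entry2[2])
    if not q1 or not q2:
        return None
    return (fresh, q1, q2)      # in the rule, both old variables are then mapped to fresh

# The Mer step of Example 7 (store entries for y1 and y2 after expansion)
e1 = ("y1", {("f1", ("a",)), ("h1", ("a",), "_", "_")},
            {("f2", ("a",)), ("h2", ("a",), "_", "_")})
e2 = ("y2", {("g1", ("b",)), ("h1", "_", ("b",), "_")},
            {("g2", ("b",)), ("h2", "_", ("b",), "_")})
print(merge(e1, e2, fresh="z"))   # ('z', {h1(a,b,_)}, {h2(a,b,_)})
```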

The answer computed by \(\mathfrak {A}_1\) is defined similarly to the answer computed by \(\mathfrak {A}^\mathsf {lin}_1\).

Example 7

Assume a, b are constants, \(f_1\), \(f_2\), \(g_1\), and \(g_2\) are unary function symbols, p is a binary function symbol, and \(h_1\) and \(h_2\) are ternary function symbols. Let \(\uplambda \) be a cut value and \(\mathcal {R}\) be defined as \(f_i \sim ^{\{(1,1)\}}_{\mathcal {R},\upalpha _i} h_i\) and \(g_i \sim ^{\{(1,2)\}}_{\mathcal {R},\upbeta _i} h_i\) with \(\upalpha _i\ge \uplambda \), \(\upbeta _i\ge \uplambda \), \(i=1,2\). To generalize \(p(f_1(a), g_1(b))\) and \(p(f_2(a), g_2(b))\), we use \(\mathfrak {A}_1\). The derivation starts as

$$\begin{aligned}&\{x : \{ p(f_1(a), g_1(b)) \} \triangleq \{ p(f_2(a), g_2(b))\} \};\, \emptyset ; \, x;\, 1;\, 1 \Longrightarrow _{\textsf {Dec}} \\&\{y_1 : \{ f_1(a) \} \triangleq \{ f_2(a)\}, \,y_2 : \{ g_1(b) \}\triangleq \{ g_2(b) \} \};\, \emptyset ; \, p(y_1,y_2);\, 1;\, 1 \Longrightarrow _{\textsf {Sol}}^2 \\&\emptyset ;\,\{y_1 : \{ f_1(a) \} \triangleq \{ f_2(a)\}, \,y_2 : \{ g_1(b)\} \triangleq \{g_2(b) \} \};\, p(y_1,y_2);\, 1;\, 1. \end{aligned}$$

At this stage, we expand the store, obtaining

$$\begin{aligned} \emptyset ;\,&\{y_1 : \{ f_1(a), h_1(a, \_,\_) \} \triangleq \{ f_2(a), h_2(a,\_,\_)\}, \\&\, y_2 : \{ g_1(b), h_1(\_, b, \_)\} \triangleq \{g_2(b), h_2(\_, b, \_) \} \};\, p(y_1,y_2);\, 1;\, 1. \end{aligned}$$

If we had the standard intersection \(\cap \) in the Mer rule, we would not be able to merge \(y_1\) and \(y_2\), because the obtained sets in the corresponding AUTs are disjoint. However, Mer uses \(\sqcap \): we have \(\{ f_i(a), h_i(a, \_,\_) \} \sqcap \{ g_i(b), h_i(\_, b, \_)\} = \{ h_i(a, b, \_) \}\), \(i=1,2\) and, therefore, can make the step

$$\begin{aligned} \emptyset ;\,&\{y_1 : \{ f_1(a), h_1(a, \_,\_) \} \triangleq \{ f_2(a), h_2(a,\_,\_)\}, \\&\, y_2 : \{ g_1(b), h_1(\_, b, \_)\} \triangleq \{g_2(b), h_2(\_, b, \_) \} \};\, p(y_1,y_2);\, 1;\, 1 \Longrightarrow _{\textsf {Mer}} \\&\emptyset ;\, \{z : \{ h_1(a, b, \_)\} \triangleq \{h_2(a, b, \_) \} \};\, p(z,z);\, 1;\, 1. \end{aligned}$$

Indeed, if we take the witness substitutions \(\sigma _i= \{ z \mapsto h_i(a, b, \_)\}\), \(i=1,2\), and apply them to the obtained generalization, we get

$$\begin{aligned}&p(z,z)\sigma _1 = p(h_1(a, b, \_), h_1(a, b, \_)) \simeq _{\mathcal {R},\uplambda } p(f_1(a), g_1(b)), \\&p(z,z)\sigma _2 = p(h_2(a, b, \_), h_2(a, b, \_)) \simeq _{\mathcal {R},\uplambda } p(f_2(a), g_2(b)). \end{aligned}$$

Theorem 3

Given \(\mathcal {R}\), \(\uplambda \), and the ground terms \(t_1\) and \(t_2\), Algorithm \(\mathfrak {A}_1\) terminates for \(\{x:\{t_1\} \triangleq \{t_2 \}\};\emptyset ;x;1;1\) and computes an answer set \(\mathsf {S}\) such that

  1. the set \(\{ r \mid (r,\sigma _1,\sigma _2,\upalpha _1,\upalpha _2) \in \mathsf {S} \}\) is an \(({\mathcal {R},\uplambda })\)-mcsrg of \(t_1\) and \(t_2\),

  2. for each \((r,\sigma _1,\sigma _2,\upalpha _1,\upalpha _2) \in \mathsf {S}\) we have \(\mathcal {R}(r\sigma _i, t_i) \le \upalpha _i = \mathsf {gdub}_{{\mathcal {R},\uplambda }}(r,t_i)\), \(i=1,2\).

Proof

Termination: Define the depth of an AUT \(x:\{t_1, \ldots , t_m\} \triangleq \{ s_1,\ldots , s_n\}\) as the depth of the term \(f(g(t_1,\ldots ,t_m), h(s_1,\ldots ,s_n))\). The rules Tri, Dec, and Sol strictly reduce the multiset of depths of AUTs in the first component of the configurations. Mer strictly reduces the number of distinct variables in generalizations. Hence, these rules cannot be applied infinitely often and \(\mathfrak {A}_1\) terminates.

In order to prove (1), we need to verify three properties:

  • Soundness: If \((r,\sigma _1,\sigma _2,\upalpha _1,\upalpha _2)\in \mathsf {S}\), then r is a relevant \(({\mathcal {R},\uplambda })\)-generalization of \(t_1\) and \(t_2\).

  • Completeness: If \(r'\) is a relevant \(({\mathcal {R},\uplambda })\)-generalization of \(t_1\) and \(t_2\), then there exists \((r,\sigma _1,\sigma _2,\upalpha _1,\upalpha _2)\in \mathsf {S}\) such that \(r' \precsim r\).

  • Minimality: If r and \(r'\) belong to two tuples from \(\mathsf {S}\) such that \(r\ne r'\), then neither \(r \prec _{{\mathcal {R},\uplambda }} r'\) nor \(r'\prec _{{\mathcal {R},\uplambda }} r\).

Soundness: We show that each rule transforms an \(({\mathcal {R},\uplambda })\)-generalization into an \(({\mathcal {R},\uplambda })\)-generalization. Since we start from a most general \(({\mathcal {R},\uplambda })\)-generalization of \(t_1\) and \(t_2\) (a fresh variable x), at the end of the algorithm we will get an \(({\mathcal {R},\uplambda })\)-generalization of \(t_1\) and \(t_2\). We also show that in this process all irrelevant positions are abstracted by anonymous variables, to guarantee that each computed generalization is relevant.

Dec: The computed h is \(({\mathcal {R},\uplambda })\)-close to the head of each term in \(T_1\cup T_2\). \(Q_{ij}\)’s correspond to argument relations between h and those heads, and each \(Q_{ij}\) is \(({\mathcal {R},\uplambda })\)-consistent, i.e., there exists a term that is \(({\mathcal {R},\uplambda })\)-close to each term in \(Q_{ij}\). It implies that \(x\sigma = h(y_1,\ldots ,y_n)\) \(({\mathcal {R},\uplambda })\)-generalizes all the terms from \(T_1\cup T_2\). Note that at this stage, \(h(y_1,\ldots ,y_n)\) might not yet be a relevant \(({\mathcal {R},\uplambda })\)-generalization of \(T_1\) and \(T_2\): if there exists an irrelevant position \(1\le i\le n\) for the \(({\mathcal {R},\uplambda })\)-generalization of \(T_1\) and \(T_2\), then in the new configuration we will have an AUT \(y_i : \emptyset \triangleq \emptyset \).

Tri: When Dec generates \(y : \emptyset \triangleq \emptyset \), the Tri rule replaces y by \(\_\) in the computed generalization, making it relevant.

Sol does not change generalizations.

Mer merges AUTs whose expanded term sets have a nonempty \(\sqcap \). Hence, we can reuse the same variable in the corresponding positions in generalizations, i.e., Mer transforms a generalization computed so far into a less general one.

Completeness: We prove a slightly more general statement. Given two finite consistent sets of ground terms \(T_1\) and \(T_2\), if \(r'\) is a relevant \(({\mathcal {R},\uplambda })\)-generalization for all \(t_1\in T_1\) and \(t_2\in T_2\), then starting from \(\{x: T_1 \triangleq T_2\};\emptyset ;x;1;1\), Algorithm \(\mathfrak {A}_1\) computes a tuple \((r,\sigma _1,\sigma _2,\upalpha _1,\upalpha _2)\) such that \(r'\precsim r\).

We may assume w.l.o.g. that \(r'\) is a relevant \(({\mathcal {R},\uplambda })\)-lgg. Due to the transitivity of \(\precsim \), completeness for such an \(r'\) will imply it for all terms more general than \(r'\).

We proceed by structural induction on \(r'\). If \(r'\) is a (named or anonymous) variable, the statement holds. Assume \(r'=h(r'_1,\ldots ,r'_n)\), \(T_1= \{u_1,\ldots ,u_m\}\), and \(T_2 = \{w_1,\ldots ,w_l\}\). Then h is such that \(h \sim _{\mathcal {R},{\upbeta _i}}^{\uprho _i} \mathsf {head}(u_i)\) for all \(1\le i \le m\) and \(h \sim _{\mathcal {R},{\upgamma _j}}^{\upmu _j} \mathsf {head}(w_j)\) for all \(1\le j \le l\). Moreover, each \(r'_k\) is a relevant \(({\mathcal {R},\uplambda })\)-generalization of \(Q_{k1}=\cup _{i=1}^m\{ u_i|_q \mid (k,q) \in \uprho _i \}\) and \(Q_{k2}=\cup _{j=1}^l\{ w_j|_q \mid (k,q) \in \upmu _j \}\) and, hence, \(Q_{k1}\) and \(Q_{k2}\) are \(({\mathcal {R},\uplambda })\)-consistent. Therefore, we can perform a step by Dec, choosing \(h(y_1,\ldots ,y_n)\) as the generalization term and \(y_i:Q_{i1}\triangleq Q_{i2}\) as the new AUTs. By the induction hypothesis, for each \(1\le i \le n\) we can compute a relevant \(({\mathcal {R},\uplambda })\)-generalization \(r_i\) for \(Q_{i1}\) and \(Q_{i2}\) such that \(r'_i\precsim r_i\).

If \(r'\) is linear, then the combination of the current Dec step with the derivations that lead to those \(r_i\)’s computes a tuple \((r,\ldots ) \in \mathsf {S}\), where \(r=h(r_1,\ldots ,r_n)\) and, hence, \(r'\precsim r\).

If \(r'\) is non-linear, assume without loss of generality that all occurrences of a shared variable z appear as the direct arguments of h: \(z=r'_{k_1} = \cdots = r'_{k_p}\) for \(1\le k_1< \cdots < k_p \le n\). Since \(r'\) is an lgg, \(Q_{k_i1}\) and \(Q_{k_i2}\) cannot be generalized by a non-variable term, thus, Tri and Dec are not applicable. Therefore, the AUTs \(y_i : Q_{k_i1}\triangleq Q_{k_i2}\) would be transformed by Sol. Since all pairs \(Q_{k_i1}\) and \(Q_{k_i2}\), \(1\le i \le p\), are generalized by the same variable, we have \(\sqcap _{t\in Q_j}\mathbf{rpc}_{{\mathcal {R},\uplambda }}(t) \ne \emptyset \), where \(Q_j=\cup _{i=1}^p Q_{k_ij}\), \(j=1,2\). Additionally, \(r'_{k_1},\ldots ,r'_{k_p}\) are all occurrences of z in \(r'\). Hence, the condition of Mer is satisfied and we can extend our derivation with a \((p-1)\)-fold application of this rule, obtaining \(r=h(r_1,\ldots ,r_n)\) with \(z=r_{k_1} = \cdots = r_{k_p}\), implying \(r' \precsim r\).

Minimality: Alternative generalizations are obtained by branching in Dec or Mer. If the current generalization r is transformed by Dec into two generalizations \(r_1\) and \(r_2\) on two branches, then \(r_1=h_1(y_1,\ldots ,y_m)\) and \(r_2=h_2(z_1,\ldots ,z_n)\) for some h’s, and fresh y’s and z’s. It may happen that \(r_1 \precsim _{{\mathcal {R},\uplambda }} r_2\) or vice versa (if \(h_1\) and \(h_2\) are \(({\mathcal {R},\uplambda })\)-close to each other), but neither \(r_1\prec _{{\mathcal {R},\uplambda }} r_2\) nor \(r_2\prec _{{\mathcal {R},\uplambda }} r_1\) holds. Hence, the set of generalizations computed before applying Mer is minimal. Mer groups AUTs together maximally, and different groupings are not comparable. Therefore, variables in generalizations are merged so that distinct generalizations are not \(\prec _{{\mathcal {R},\uplambda }}\)-comparable. Hence, (1) is proven.

As for (2), for \(i=1,2\), from the construction in Dec it follows that \(\mathcal {R}(r\sigma _i, t_i) \le \upalpha _i\). Mer does not change \(\upalpha _i\), thus, \(\upalpha _i = \mathsf {gdub}_{{\mathcal {R},\uplambda }}(r,t_i)\) also holds, since the way \(\upalpha _i\) is computed corresponds exactly to the computation of \(\mathsf {gdub}_{{\mathcal {R},\uplambda }}(r,t_i)\): \(r \precsim _{{\mathcal {R},\uplambda }} t_i\) and only the decomposition changes the degree during the computation.   \(\Box \)

The corollary below is proved similarly to Theorem 3:

Corollary 1

Given \(\mathcal {R}\), \(\uplambda \), and the ground terms \(t_1\) and \(t_2\), Algorithm \(\mathfrak {A}^\mathsf {lin}_1\) terminates for \(\{x:\{t_1\} \triangleq \{t_2 \}\};\emptyset ;x;1;1\) and computes an answer set \(\mathsf {S}\) such that

  1. the set \(\{ r \mid (r,\sigma _1,\sigma _2,\upalpha _1,\upalpha _2) \in \mathsf {S} \}\) is a minimal complete set of relevant linear \(({\mathcal {R},\uplambda })\)-generalizations of \(t_1\) and \(t_2\),

  2. for each \((r,\sigma _1,\sigma _2,\upalpha _1,\upalpha _2) \in \mathsf {S}\) we have \(\mathcal {R}(r\sigma _i, t_i) \le \upalpha _i = \mathsf {gdub}_{{\mathcal {R},\uplambda }}(r,t_i)\), \(i=1,2\).

4.2 Anti-unification with Correspondence Argument Relations

Correspondence relations make sure that for a pair of proximal symbols, no argument is irrelevant for proximity. Left- and right-totality of those relations guarantee that each argument of a term is close to at least one argument of its proximal term and the inverse relation remains a correspondence relation. Consequently, in the Dec rule of \(\mathfrak {A}_1\), the sets \(Q_{ij}\) never get empty. Therefore, the Tri rule becomes obsolete and no anonymous variable appears in generalizations. As a result, the \(({\mathcal {R},\uplambda })\)-mcsrg and the \(({\mathcal {R},\uplambda })\)-mcsg coincide, and the algorithm computes a solution from which we get an \(({\mathcal {R},\uplambda })\)-mcsg for the given anti-unification problem. The linear version \(\mathfrak {A}^\mathsf {lin}_1\) works analogously.

4.3 Anti-unification with Argument Mappings

When the argument relations are mappings, we are able to design a more constructive method for computing generalizations and their degree bounds. (Recall that our mappings are partial injective functions, which guarantees that their inverses are also mappings.) We denote this algorithm by \(\mathfrak {A}_2\). The configurations stay the same as before, but the AUTs in A will contain only empty or singleton sets of terms. In the store, we may still get (after the expansion) AUTs with term sets containing more than one element. Only the Dec rule differs from its previous counterpart, having a simpler condition:

Dec: Decomposition

$$\begin{aligned}&\{x: T_1\triangleq T_2 \}\uplus A; S;r;\upalpha _1;\upalpha _2 \Longrightarrow \nonumber \\&\qquad \{ y_i : Q_{i1}\triangleq Q_{i2} \mid 1\le i \le n \}\cup A; S; r \{x\mapsto h(y_1,\ldots ,y_n)\}; \upalpha _1\wedge \upbeta _1;\upalpha _2\wedge \upbeta _2, \end{aligned}$$

where \(T_1\cup T_2 \ne \emptyset \); h is n-ary with \(n\ge 0\); \(y_1,\ldots , y_n\) are fresh; for \(j=1,2\) and for all \(1\le i \le n\), if \(T_j= \{t_j\}\) then \(h \sim _{\mathcal {R},\upbeta _j}^{\uppi _j} \mathsf {head}(t_j)\) and \(Q_{ij}= \{ t_j|_{\uppi _j(i)}\}\) if \(i\in \mathsf {dom}(\uppi _j)\) and \(Q_{ij}=\emptyset \) otherwise, and if \(T_j= \emptyset \) then \(\upbeta _j=1\) and \(Q_{ij}= \emptyset \).

This Dec rule is equivalent to the special case of Dec for argument relations where \(m_j\le 1\). The new \(Q_{ij}\)’s contain at most one element (due to mappings) and, thus, are always \(({\mathcal {R},\uplambda })\)-consistent. Various choices of h in Dec and alternatives in grouping AUTs in Mer cause branching in the same way as in \(\mathfrak {A}_1\). It is easy to see that the counterparts of Theorem 3 hold for \(\mathfrak {A}_2\) and \(\mathfrak {A}^\mathsf {lin}_2\) as well.
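To make this rule concrete, here is a minimal Python sketch of a single Dec step for the mapping case. The term representation (nested tuples with the head at index 0), the helper names, and the treatment of positions outside the domain of a partial mapping as unconstrained are our own illustrative choices:

```python
from itertools import count

_fresh = count(1)  # fresh-variable counter

def arg_set(Tj, pij, i):
    """Q_ij: the singleton {t_j|_{pi_j(i)}} if T_j = {t_j} and pi_j is defined at i,
    otherwise the empty set (positions outside the partial mapping's domain
    are treated as unconstrained)."""
    if Tj and i in pij:
        (tj,) = Tj              # T_j is a singleton
        return {tj[pij[i]]}     # terms are tuples ('f', arg1, ...); index 0 is the head
    return set()

def dec_step(x, T1, T2, h, n, pi1, beta1, pi2, beta2, alpha1, alpha2):
    """One Dec step of the mapping-based algorithm (illustrative sketch):
    replace the AUT x : T1 =?= T2 by n new AUTs, record the binding
    x -> h(y1, ..., yn), and update the degree bounds with min (our T-norm)."""
    ys = [f"y{next(_fresh)}" for _ in range(n)]
    new_auts = [(y, arg_set(T1, pi1, i), arg_set(T2, pi2, i))
                for i, y in enumerate(ys, start=1)]
    binding = (x, (h,) + tuple(ys))
    return new_auts, binding, min(alpha1, beta1), min(alpha2, beta2)

# Generalizing f(a, b) and g(c), with h 0.8-close to f via pi1 = {1: 2, 2: 1}
# and 0.7-close to g via pi2 = {1: 1}:
auts, binding, a1, a2 = dec_step(
    "x", {("f", ("a",), ("b",))}, {("g", ("c",))},
    "h", 2, {1: 2, 2: 1}, 0.8, {1: 1}, 0.7, 1.0, 1.0)
print(auts)     # [('y1', {('b',)}, {('c',)}), ('y2', {('a',)}, set())]
print(binding)  # ('x', ('h', 'y1', 'y2'))
print(a1, a2)   # 0.8 0.7
```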

A special case of this fragment of anti-unification is anti-unification for similarity relations in fully fuzzy signatures from [1]. Similarity relations are min-transitive proximity relations. The position mappings in [1] can be modeled by our argument mappings, requiring them to be total for symbols of the smaller arity and to satisfy the similarity-specific consistency restrictions from [1].

4.4 Anti-unification with Correspondence Argument Mappings

Correspondence argument mappings are bijections between arguments of function symbols of the same arity. For such mappings, if \(h \simeq _{\mathcal {R},\uplambda }^\uppi f\) and h is n-ary, then f is also n-ary and \(\uppi \) is a permutation of \((1,\ldots ,n)\). Hence, in this case \(\mathfrak {A}_2\) combines the properties of \(\mathfrak {A}_1\) for correspondence relations (Sect. 4.2) and of \(\mathfrak {A}_2\) for argument mappings (Sect. 4.3): all generalizations are relevant, the computed answer gives an mcsg of the input terms, and the algorithm works with term sets of cardinality at most 1.

5 Remarks About the Complexity

The proximity relation \(\mathcal {R}\) can be naturally represented as an undirected graph, where the vertices are function symbols and an edge between them indicates that they are proximal. Graphs induced by proximity relations are usually sparse. Therefore we can represent them by (sorted) adjacency lists. In the adjacency lists, we can also accommodate the argument relations and proximity degrees.
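A possible realization of this representation is sketched below; the dictionary layout and field order are our own choices, and the sketch is only meant to illustrate how degrees and argument relations can be stored alongside the adjacency lists:

```python
# Adjacency-list representation of a proximity relation: for every function
# symbol we store a sorted list of its proximal symbols, each with its
# proximity degree and argument relation (a set of position pairs).
proximity = {
    "f": [("f", 1.0, {(1, 1), (2, 2)}),      # reflexivity: f ~ f with degree 1
          ("g", 0.7, {(1, 1), (2, 1)})],     # binary f close to unary g
    "g": [("f", 0.7, {(1, 1), (1, 2)}),      # symmetric entry, argument relation inverted
          ("g", 1.0, {(1, 1)})],
}

def proximity_class(symbol, lam):
    """Symbols proximal to `symbol` with degree at least lambda."""
    return [h for (h, deg, _) in proximity.get(symbol, []) if deg >= lam]

print(proximity_class("f", 0.5))  # ['f', 'g']
print(proximity_class("f", 0.8))  # ['f']
```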

In the rest of this section we use the following notation:

  • n: the size of the input (number of symbols) of the corresponding algorithms,

  • \(\varDelta \): the maximum degree of \(\mathcal {R}\) considered as a graph,

  • \(\mathfrak {a}\): the maximum arity of function symbols that occur in \(\mathcal {R}\),

  • \(m^{\bullet n}\): a function defined on natural numbers m and n such that \(1^{\bullet n} = n\) and \(m^{\bullet n} = m^n\) for \(m\ne 1\) (a small sketch illustrating this notation follows the list).
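The following tiny Python helper (the function name is ours) implements the \(m^{\bullet n}\) notation and evaluates the exponential factors appearing in the bounds stated in Theorem 4 below, for small parameter values:

```python
def bullet_pow(m, n):
    """The m^{.n} notation: 1^{.n} = n, and m^{.n} = m**n for m != 1.
    It keeps the bounds meaningful when the degree or the arity equals 1."""
    return n if m == 1 else m ** n

# Exponential factors of the bounds below, for Delta = 3, a = 2, n = 4:
Delta, a, n = 3, 2, 4
print(bullet_pow(Delta, bullet_pow(a, n)))  # argument relations: 3 ** (2 ** 4)
print(bullet_pow(Delta, n))                 # argument mappings:  3 ** 4
print(bullet_pow(1, 5), bullet_pow(5, 1))   # 5 and 5: the two cases of the definition
```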

We assume that the given anti-unification problem is represented as a completely shared directed acyclic graph (dag). Each node of the dag has a pointer to the adjacency list (with respect to \(\mathcal {R}\)) of the symbol in the node.
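A possible realization of such a shared dag, with hypothetical field names, is sketched below; it only illustrates sharing of structurally equal subterms and the per-node pointer into the proximity graph:

```python
class Dag:
    """A minimal sketch of a completely shared term dag (hash-consing):
    structurally equal subterms are created only once, and each node keeps
    a pointer to the adjacency list of its head symbol with respect to R."""

    def __init__(self, proximity):
        self.proximity = proximity   # symbol -> adjacency list, as sketched above
        self.nodes = {}              # (head, child ids) -> node record

    def mk(self, head, *children):
        key = (head, children)
        if key not in self.nodes:
            self.nodes[key] = {"id": len(self.nodes), "head": head,
                               "children": children,
                               "adj": self.proximity.get(head, [])}
        return self.nodes[key]["id"]

dag = Dag({"f": [("f", 1.0, {(1, 1), (2, 2)})]})   # toy proximity relation
a = dag.mk("a")
t = dag.mk("f", dag.mk("g", a), dag.mk("g", a))    # the subterm g(a) is shared
print(len(dag.nodes))                              # 3 nodes: a, g(a), f(g(a), g(a))
```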

Theorem 4

Time complexities of \(\mathfrak {C}\) and the linear versions of the generalization algorithms are as follows:

  • \(\mathfrak {C}\) for argument relations and \(\mathfrak {A}^\mathsf {lin}_1\): \(O(n\cdot \varDelta \cdot \varDelta ^ {\bullet \mathfrak {a}^ {\bullet n} } )\),

  • \(\mathfrak {C}\) for argument mappings and \(\mathfrak {A}^\mathsf {lin}_2\): \(O(n\cdot \varDelta \cdot \varDelta ^ {\bullet n} )\).

Proof

(Sketch). In \(\mathfrak {C}\), in the case of argument relations, an application of the Red rule to a state \(\mathbf {I};s\) replaces one element of \(\mathbf {I}\) of size m by at most \(\mathfrak {a}\) new elements, each of them of size \(m-1\). Hence, one branch in the search tree for \(\mathfrak {C}\), starting from a singleton set \(\mathbf {I}\) whose element has size n, has length at most \(l=\sum _{i=0}^{n-1}\mathfrak {a}^i\). At each node on it there are at most \(\varDelta \) choices of applying Red with different h’s, which bounds the total size of the search tree by \(\sum _{i=0}^{l-1}\varDelta ^{i}\); i.e., the number of steps performed by \(\mathfrak {C}\) in the worst case is \(O( \varDelta ^ {\bullet \mathfrak {a}^ {\bullet n} } )\). Those different h’s are obtained by intersecting the proximity classes of the heads of the terms \(\{t_1,\ldots ,t_m\}\) in the Red rule. In our graph representation of the proximity relation, proximity classes of symbols are exactly the adjacency lists of those symbols, which we assume to be sorted. Their maximal length is \(\varDelta \). Hence, the work to be done at each node of the search tree of \(\mathfrak {C}\) is to find the intersection of at most n sorted lists, each containing at most \(\varDelta \) elements. This needs \(O(n\cdot \varDelta )\) time, giving the time complexity \(O(n\cdot \varDelta \cdot \varDelta ^ {\bullet \mathfrak {a}^ {\bullet n} } )\) of \(\mathfrak {C}\) for the relation case.

In the mapping case, an application of the Red rule to a state \(\mathbf {I};s\) replaces one element of \(\mathbf {I}\) of size m by at most \(\mathfrak {a}\) new elements of total size \(m-1\). Therefore, the maximal length of a branch is n, the branching factor is \(\varDelta \), and the amount of work at each node, as above, is \(O(n\cdot \varDelta )\). Hence, the number of steps in the worst case is \(O( \varDelta ^ {\bullet n} )\) and the time complexity of \(\mathfrak {C}\) is \(O(n\cdot \varDelta \cdot \varDelta ^ {\bullet n} )\).

The fact that the consistency check is incorporated into the Dec rule of \(\mathfrak {A}^\mathsf {lin}_1\) can be used to guide the application of this rule, using the values memoized by previous applications of Red. The very first time, the appropriate h in Dec is chosen arbitrarily. In any subsequent application of this rule, h is chosen according to the result of the Red rule that has already been applied to the arguments of the current AUT for their consistency check, as required by the condition of Dec. In this way, the applications of Dec and Sol correspond to the applications of Red, and there is a natural correspondence between the applications of the Rem and Tri rules. Therefore, \(\mathfrak {A}_1^\mathsf {lin}\) has a search tree analogous to that of \(\mathfrak {C}\). Hence, the complexity of \(\mathfrak {A}_1^\mathsf {lin}\) is \(O(n\cdot \varDelta \cdot \varDelta ^ {\bullet \mathfrak {a}^ {\bullet n} } )\). \(\mathfrak {A}_2^\mathsf {lin}\) does not call the consistency check but performs the same work as \(\mathfrak {C}\) and, hence, has the same complexity \(O(n\cdot \varDelta \cdot \varDelta ^ {\bullet n} )\).    \(\Box \)

6 Discussion and Conclusion

The diagram below illustrates the connections between different anti-unification problems based on argument relations:

[Diagram: the four problem variants, with argument relations and argument mappings as rows and the unrestricted and correspondence cases as columns; arrows point from the more general to the more specific problems.]

The arrows indicate the direction from more general problems to more specific ones. For the unrestricted cases (left column) we compute mcsrg’s. For correspondence relations and correspondence mappings (right column), mcsg’s are computed. (In fact, for them, the notions of mcsrg and mcsg coincide). The algorithms for relations (upper row) are more involved than those for mappings (lower row): Those for relations deal with AUTs containing arbitrary sets of terms, while for mappings, those sets have cardinality at most one, thus simplifying the conditions in the rules. Moreover, the two cases in the lower row generalize the existing anti-unification problems:

  • the unrestricted mappings case generalizes the problem from [1] by extending similarity to proximity and relaxing the smaller-side-totality restriction;

  • the correspondence mappings case generalizes the problem from [9] by allowing permutations between arguments of proximal function symbols.

All our algorithms can be easily turned into anti-unification algorithms for crisp tolerance relations by taking \(\uplambda \)-cuts and ignoring the computation of the approximation degrees. Besides, they are modular and can be used to compute only linear generalizations by simply skipping the merging rule. We provided complexity estimations for the algorithms that compute linear generalizations (which are often of practical interest).
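For instance, taking the \(\uplambda \)-cut and dropping the degrees can be sketched as follows (the dictionary encoding of \(\mathcal {R}\) is an assumption of ours):

```python
def lambda_cut(R, lam):
    """Crisp tolerance relation obtained from a fuzzy proximity relation R
    (a dict mapping symbol pairs to degrees): keep the pairs with degree >= lam."""
    return {pair for pair, deg in R.items() if deg >= lam}

R = {("f", "f"): 1.0, ("f", "g"): 0.7, ("g", "h"): 0.4}
print(sorted(lambda_cut(R, 0.5)))   # [('f', 'f'), ('f', 'g')]
```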

In this paper, we did not consider cases in which the same pair of symbols is related by more than one argument relation. Our results can be extended to such cases, which would open a way towards approximate anti-unification modulo background theories specified by shallow collapse-free axioms. Another interesting direction of future work would be to extend our results to quantitative algebras [10], which also deal with quantitative extensions of equality.