HigherOrder Pattern AntiUnification in Linear Time
 731 Downloads
Abstract
We present a rulebased Huet’s style antiunification algorithm for simply typed lambdaterms, which computes a least general higherorder pattern generalization. For a pair of arbitrary terms of the same type, such a generalization always exists and is unique modulo \(\alpha \)equivalence and variable renaming. With a minor modification, the algorithm works for untyped lambdaterms as well. The time complexity of both algorithms is linear.
Keywords
Generalizations of lambda terms Antiunification Higherorder patterns1 Introduction
The antiunification problem of two terms \(t_1\) and \(t_2\) is concerned with finding their generalization, a term t such that both \(t_1\) and \(t_2\) are instances of t under some substitutions. Interesting generalizations are the least general ones. The purpose of antiunification algorithms is to compute such least general generalizations (lggs).
For higherorder terms, in general, there is no unique higherorder lgg. Therefore, special classes have been considered for which the uniqueness is guaranteed. One of such classes is formed by higherorder patterns. These are \(\lambda \)terms where the arguments of free variables are distinct bound variables. They were introduced by Miller [28] and gained popularity because of an attractive combination of expressive power and computational costs: There are practical unification algorithms [29, 30, 31] that compute most general unifiers whenever they exist. Pfenning [31] gave the first algorithm for higherorder pattern antiunification in the Calculus of Constructions, with the intention of using it for proof generalization.
Since then, there have been several approaches to higherorder antiunification, designing algorithms in various restricted cases. Motivated by applications in inductive learning, Feng and Muggleton [15] proposed antiunification in \(M\lambda \), which is essentially an extension of higherorder patterns by permitting free variables to apply to object terms, not only to bound variables. Object terms may contain constants, free variables, and variables which are bound outside of object terms. The algorithm has been implemented and was used for inductive generalization.
Pientka [32] studied antiunification of linear higherorder patterns in the framework of developing substitution tree indexing for higherorder terms. Linear higherorder patterns require that every metavariable occurs at most once in them, and they apply to all distinct bound variables in its context. The generalization algorithm has been used for the insertion of terms into the index.
Lu et al. [26] investigated antiunification in a restricted version of \(\lambda 2\) (a secondorder \(\lambda \)calculus with type variables [3]) and its applications in analogical programming and analogical theorem proving. The imposed restrictions guarantee uniqueness of the least general generalization. This algorithm as well as the one for higherorder patterns by Pfenning [31] have influenced the generalization algorithm used in the program transformation technique called supercompilation [27].
There are other fragments of higherorder antiunification, motivated by analogical reasoning. A restricted version of secondorder generalization has an application in the replay of program derivations [16]. A symbolic analogy model, called HeuristicDriven Theory Projection, uses yet another restriction of higherorder antiunification to detect analogies between different domains [21].
The last decade has seen a revived interest in antiunification. The problem has been studied in various theories (e.g., [1, 10, 22]) and from different application points of view (e.g., [2, 9, 21, 25, 26, 35]). A particularly interesting application comes from software code refactoring, to find similar pieces of code, e.g., in Python, Java [7, 8] and Erlang [25] programs. These approaches are based on the firstorder antiunification [33, 34]. To advance the refactoring and clone detection techniques for languages based on \(\lambda \) Prolog, one needs to employ antiunification for higherorder terms. Yet another motivation to look into the problem of higherorder antiunification in more detail would be the improvement of indexing techniques for \(\lambda \)terms used, e.g., in mathematical assistant systems.
In this paper, we revisit the problem of higherorder antiunification and present a rulebased antiunification algorithm (in the style of Huet [19]) for simply typed \(\lambda \)calculus. The input of the algorithm are arbitrary terms in \(\eta \)long \(\beta \)normal form. The output is a higherorder pattern. The global function for recording disagreements is represented as a store, in the spirit of Alpuente et al. [1]. We prove that a least general pattern generalization always exists and is unique modulo \(\alpha \)equivalence. The proposed algorithm computes it in linear time. As it is done in related work, we assume that symbols and pointers are encoded in constant space, and basic operations on them are performed in constant time. With a small modification, the algorithm works for untyped lambdacalculus as well.
This paper is an extended and improved version of our conference publication [5]. There, it is proved that the problem is solvable in cubic time. A free opensource implementation for both simply typed and untyped calculi of this previous version of the algorithm is available.
1.1 Comparison with Some Related Work

Pfenning [31] studied antiunification in the Calculus of Constructions, whose type system is richer than the simple types we consider. Both the input and the output was required to be higherorder patterns. Some questions have remained open, including the efficiency, applicability, and implementations of the algorithm. Due to the nature of type dependencies in the calculus, the author was not able to formulate the algorithm in Huet’s style [19], where a global function is used to guarantee that the same disagreements between the input terms are mapped to the same variable. The complexity has not been studied and the proofs of the algorithm properties have been just sketched.

Antiunification in \(M\lambda \) [15] is performed on simply typed terms, where both the input and the output are restricted to a certain extension of higherorder patterns. In this sense it is not comparable to our case, because we do not restrict the input, but require patterns in the output. Moreover, it contains neither the complexity analysis of the \(M\lambda \) antiunification algorithm nor the proofs of its properties.

The antiunification algorithm proposed by Pientka [32] also considers simply typed terms with the input and output restricted. The restriction requires terms to be linear higherorder patterns. Complexity results are not reported. This approach is also different from ours for the same reason as above: We do not restrict the input. It should be said that omitting one of the rules in our algorithm (the merging rule), we can also compute linear pattern generalizations for arbitrary input.

Antiunification for a restricted version of \(\lambda 2\) [26] requires the \(\lambda \)abstraction not to be used in arguments. The algorithm computes a generalization which is least general with respect to the combination of several orderings defined in the paper. The properties of the algorithm are formally proved, but the complexity has not been analyzed. As the authors point out, the orderings they define are not comparable with the ordering used to compute higherorder pattern generalizations.

Generalization algorithms proposed by Hirata et al. [18] work on secondorder terms which contain no \(\lambda \)abstractions. The output is also restricted: It may contain variables which can be instantiated with multihole contexts only. Varying restrictions on the instantiation, various versions of generalizations are obtained. This approach is not comparable with ours.

Yet another antiunification algorithm for \(\lambda \)abstractionfree terms has been developed for analogy making [21]. The application dictates the typical input to be firstorder, while their generalizations may contain secondorder variables. A certain measure is introduced to compare generalizations, and the algorithm computes those which are preferred by this measure. This approach is not comparable with ours either.

The approach of Hasker [16] is also different from what we do. The antiunification algorithm there works on a restriction of combinator terms and computes their generalizations (in quadratic time). It has been used for program derivation.
2 Preliminaries
In higherorder signatures we have types constructed from a set of basic types (typically \(\delta \)) using the grammar \(\tau \,{:}{:}{=}\, \delta \mid \tau \rightarrow \tau \) , where \(\rightarrow \) is associative to the right. Variables (typically \(X, Y, Z, x, y, z, a, b, \ldots \)) and constants (typically \(f, c, \ldots \)) have an assigned type.
Other standard notions of the simply typed \(\lambda \)calculus, like bound and free occurrences of variables, \(\alpha \)conversion, \(\beta \)reduction, \(\eta \)long \(\beta \)normal form, etc. are defined as usual (see [13]). By default, terms are assumed to be written in \(\eta \)long \(\beta \)normal form. Therefore, all terms have the form \(\lambda x_1, \ldots , x_n .h(t_1 , \ldots , t_m )\), where \(n, m \ge 0\), h is either a constant or a variable, \(t_1 , \ldots , t_m\) have also this form, and the term \(h(t_1 , \ldots , t_m )\) has a basic type.
The set of free variables of a term t is denoted by \(\mathrm {Vars}(t)\). When we write an equality between two \(\lambda \)terms, we mean that they are equivalent modulo \(\alpha \), \(\beta \) and \(\eta \) equivalence.
The size of a term t, denoted \({t}\), is defined recursively as \({h(t_1,\ldots ,t_n)}= 1+ \sum _{i=1}^n{t_i}\) and \({\lambda x\cdot t} = 1 + {t}\).
The depth of a term t, denoted \(\mathrm {Depth}(t)\) is defined recursively as \(\mathrm {Depth}(h(t_1,\ldots ,t_n))= 1+ \max _{i\in \{1,\dots ,n\}} \mathrm {Depth}(t_i)\) and \(\mathrm {Depth}(\lambda x\cdot t) = 1 + \mathrm {Depth}(t)\).
For a term \(t= \lambda x_1,\ldots ,x_n. h(t_1,\ldots ,t_m)\) with \(n,m\ge 0\), its head is defined as \(\mathrm {Head}(t)=h\).
Positions in \(\lambda \)terms are defined with respect to their tree representation in the usual way, as string of integers. For instance, in the term \(f(\lambda x.\lambda y\cdot g(\lambda z.h(z,y),\) \( x), \lambda u\cdot g(u))\), the symbol f stands in the position \(\epsilon \) (the empty sequence), the occurrence of \(\lambda x.\) stands in the position 1, the bound occurrence of y in 1.1.1.1.1.2, the bound occurrence of u in 2.1.1, etc.
The path to a position in a \(\lambda \)term is defined as the sequence of symbols from the root to the node at that position (not including) in the tree representation of the term. For instance, the path to the position 1.1.1.1.1 in \(f(\lambda x.\lambda y\cdot g(\lambda z\cdot h(z,y), x), \lambda u\cdot g(u))\) is \(f,\lambda x,\lambda y,g,\lambda z\).
A higherorder pattern is a \(\lambda \)term where, when written in \(\eta \)long \(\beta \)normal form, all free variable occurrences are applied to lists of pairwise distinct (\(\eta \)long forms of) bound variables. For instance, \(\lambda x. f (X (x), Y )\), \(f (c, \lambda x.x)\) and \(\lambda x.\lambda y\cdot X (\lambda z.x(z), y)\) are patterns, while \(\lambda x\cdot f (X (X (x)), Y )\), f (X (c), c) and \(\lambda x\cdot \lambda y\cdot X (x, x)\) are not.
Substitutions are finite sets of pairs \(\{X_1\mapsto t_1,\ldots , X_n\mapsto t_n\}\) where \(X_i\) and \(t_i\) have the same type and the X’s are pairwise distinct variables. They can be extended to type preserving functions from terms to terms as usual, avoiding variable capture. The notions of substitution domain and range are also standard and are denoted, respectively, by \(\mathrm {Dom}\) and \(\mathrm {Ran}\).
We use postfix notation for substitution applications, writing \(t\sigma \) instead of \(\sigma (t)\). As usual, the application \(t\sigma \) affects only the free occurrences of variables from \(\mathrm {Dom}(\sigma )\) in t. We write \(\vec {x}\sigma \) for \(x_1\sigma ,\ldots ,x_n\sigma \), if \(\vec {x}=x_1,\ldots ,x_n\). Similarly, for a set of terms S, we define \(S\sigma = \{t\sigma \mid t\in S\}\). The composition of \(\sigma \) and \(\vartheta \) is written as juxtaposition \(\sigma \vartheta \) and is defined as \(x(\sigma \vartheta ) = (x\sigma )\vartheta \) for all x. Yet another standard operation, restriction of a substitution \(\sigma \) to a set of variables S, is denoted by \(\sigma _S\).
A substitution \(\sigma _1\) is more general than \(\sigma _2\), written \(\sigma _1\preceq \sigma _2\), if there exists \(\vartheta \) such that \(X\sigma _1\vartheta = X\sigma _2\) for all \(X \in \mathrm {Dom}(\sigma _1)\cup \mathrm {Dom}(\sigma _2)\). The strict part of this relation is denoted by \(\prec \). The relation \(\preceq \) is a partial order and generates the equivalence relation which we denote by \(\simeq \). We overload \(\preceq \) by defining \(s\preceq t\) if there exists a substitution \(\sigma \) such that \(s\sigma = t\).
A term t is called a generalization or an antiinstance of two terms \(t_1\) and \(t_2\) if \(t\preceq t_1\) and \(t\preceq t_2\). It is a higherorder pattern generalization if additionally t is a higherorder pattern. It is the least general generalization, (lgg in short), aka a most specific antiinstance, of \(t_1\) and \(t_2\), if there is no generalization s of \(t_1\) and \(t_2\) which satisfies \(t\prec s\).

\(\lambda \vec {x}\cdot X(\vec {x})\), \(\lambda \vec {x}\cdot t\), and \(\lambda \vec {x}\cdot s\) are terms of the same type,

t and s are in \(\eta \)long \(\beta \)normal form, and

X does not occur in t and s.
An antiunifier of an AUP \(X(\vec {x}) :t\triangleq s\) is a substitution \(\sigma \) such that \(\mathrm {Dom}(\sigma ) = \{X\}\) and \(\lambda \vec {x}\cdot X(\vec {x})\sigma \) is a term which generalizes both \(\lambda \vec {x}\cdot t\) and \(\lambda \vec {x}\cdot s\).
An antiunifier of \(X(\vec {x}):t\triangleq s\) is least general (or most specific) if there is no antiunifier \(\vartheta \) of the same problem that satisfies \(\sigma \prec \vartheta \). Obviously, if \(\sigma \) is a least general antiunifier of an AUP \(X(\vec {x}):t\triangleq s\), then \(\lambda \vec {x}\cdot X(\vec {x})\sigma \) is a lgg of \(\lambda \vec {x}\cdot t\) and \(\lambda \vec {x}\cdot s\).
 Given:
Higherorder terms t and s of the same type in \(\eta \)long \(\beta \)normal form.
 Find:
A higherorder pattern generalization r of t and s.
Below we assume that in the AUPs of the form \(X(\vec {x}) : t \triangleq s\), the term \(\lambda \vec {x}\cdot X(\vec {x})\) is a higherorder pattern.
3 Transformation Rules for a Variant of HigherOrder AntiUnification
In this section we describe a set of transformation rules for higherorder antiunification. These rules work on triples \(A;S;\sigma \), which we call states. Here A is a set of AUPs of the form \(\{X_1(\vec {x_1}) : t_1 \triangleq s_1, \ldots , X_n(\vec {x_n}) : t_n \triangleq s_n\}\) that are pending to antiunify, S is a set of already solved AUPs (the store), and \(\sigma \) is a substitution (computed so far) mapping variables to patterns.
Remark 1
We assume that in the set \(A\cup S\) each occurrence of \(\lambda \) binds a distinct name variable (in other words, all names of bound variables are distinct), and that each \(X_i\) occurs in \(A\cup S\) only once.
Definition 1
To compute generalizations for terms t and s, we start with the initial state \(\{X : t \triangleq s\}; \varnothing ; \varnothing \), where X is a fresh variable, and apply the transformations as long as possible, until no transformation applies. These final states have the form \(\varnothing ; S; \varphi \), where Mer does not apply to S. Then, the result computed by \(\mathfrak {P}\) is \(X\varphi \).
One can easily show that a triple obtained from \(A;S;\sigma \) by applying any of the rules above to a state is indeed a state: For each expression \(X(\vec {x}): t \triangleq s \in A\cup S\), the terms \(X(\vec {x})\), t and s have the same type, \(\lambda \vec {x}\cdot X(\vec {x})\) is a higherorder pattern, s and t are in \(\eta \)long \(\beta \)normal form, and X does not occur in t and s. Moreover, all generalization variables are distinct and substitutions map variables to patterns.
The property that each occurrence of \(\lambda \) in \(A\cup S\) binds a unique variable is also maintained. It guarantees that in the Abs rule, the variable y is fresh for s. After the application of the rule, y will appear nowhere else in \(A\cup S\) except \(X'(\vec {x},y)\) and, maybe, t and s.
Like in the antiunification algorithms working on triple states [1, 22], the idea of the store here is to keep track of already solved AUPs in order to reuse an existing variable in generalizations. This is important, since we aim at computing lggs.
The Mer rule requires solving a matching problem Open image in new window with the substitution \(\pi \) which bijectively maps the variables from \(\vec {x}\) to the variables from \(\vec {y}\). In general, a matching problem is defined as follows.
Definition 2
(Permuting matcher) Given a set of pairs of terms in \(\eta \)long \(\beta \)normal form Open image in new window and two sets of variables D and R such that \(D\subseteq \bigcup _{i=1}^n \mathrm {Vars}(t_i)\) and \(R\subseteq \bigcup _{i=1}^n \mathrm {Vars}(s_i)\), a permuting matcher is a bijection \(\pi :D\rightarrow R\) such that, extended as a substitution \(\pi \) of variables \(x\in D\) by variables \(\pi (y)\in R\), satisfies \(t_i\pi = s_i\), for \(i=1,\dots ,n\).
The permuting matcher, if it exists, is unique^{2} and is denoted by \({\mathrm {match}}(D,R,P)\). When this map does not exist, we write \({\mathrm {match}}(D,R,P)=\bot \).
An algorithm that decides the existence of the permuting matcher and computes it in linear time is given in [5]. Here, in Sect. 5, we show that a more general problem can also be solved in linear time.
Example 1
 Let \(t= \lambda x,y. f(U(g(x),y), U(g(y),x))\) and \(s=\lambda x',y'. f(h(y',g(x')),\) \( h(x',g(y')))\). Then \(\mathfrak {P}\) performs the following transformations:The computed result is \(r= \lambda x,y. f(Y_1(x,y),Y_1(y,x))\). It generalizes the input terms t and s: \(r\{Y_1\mapsto \lambda x,y. U(g(x),y)\}=t\) and \(r\{Y_1 \mapsto \lambda x,y.h(y,g(x))\}=s\). These substitutions can be read from the final store.$$\begin{aligned} \begin{array}{ll} &{} \{X: \lambda x,y. f(U(g(x),y), U(g(y),x)) \triangleq \lambda x',y'. f(h(y',g(x')), h(x',g(y'))) \};\ \varnothing ;\ \varnothing \\ \Longrightarrow ^2_{\mathrm {Abs}}{} &{} \{X'(x,y): f(U(g(x),y), U(g(y),x)) \triangleq f(h(y,g(x)), h(x,g(y))) \};\ \varnothing ; \\ &{} \{X\mapsto \lambda x,y. X'(x,y)\} \\ \Longrightarrow _{\mathrm {Dec}}{} &{} \{Y_1(x,y): U(g(x),y) \triangleq h(y,g(x)), Y_2(x,y): U(g(y),x) \triangleq h(x,g(y)) \};\ \varnothing ;\\ &{} \{X\mapsto \lambda x,y. f(Y_1(x,y),Y_2(x,y)), X' \mapsto \lambda x,y. f(Y_1(x,y),Y_2(x,y))\} \\ \Longrightarrow _{\mathrm {Sol}}{} &{} \{Y_2(x,y): U(g(y),x) \triangleq h(x,g(y)) \}; \{Y_1(x,y): U(g(x),y) \triangleq h(y,g(x))\}; \\ &{} \{X\mapsto \lambda x,y. f(Y_1(x,y),Y_2(x,y)), X' \mapsto \lambda x,y. f(Y_1(x,y),Y_2(x,y))\} \\ \Longrightarrow _{\mathrm {Sol}}{} &{} \varnothing ;\ \{Y_1(x,y): U(g(x),y) \triangleq h(y,g(x)),Y_2(x,y): U(g(y),x) \triangleq h(x,g(y))\}; \\ &{} \{X\mapsto \lambda x,y. f(Y_1(x,y),Y_2(x,y)), X' \mapsto \lambda x,y. f(Y_1(x,y),Y_2(x,y))\} \\ \Longrightarrow _{\mathrm {Mer}} &{} \varnothing ;\ \{Y_1(x,y): U(g(x),y) \triangleq h(y,g(x))\};\ \{X\mapsto \lambda x,y. f(Y_1(x,y),Y_1(y,x)), \\ &{} \quad X' \mapsto \lambda x,y. f(Y_1(x,y),Y_1(y,x)), Y_2\mapsto \lambda x,y. Y_1(y,x)\} \end{array} \end{aligned}$$

For \(\lambda x,y,z. g( f(x, z), f(y, z), f(y, x))\) and \(\lambda x',y',z'. g( h(y',x'),\) \( h(x',y'), h(z',y'))\), \(\mathfrak {P}\) computes their generalization \(\lambda x,y,z. f(Y_1(x,y,z), Y_1(y,x,z), Y_1(y,z,x))\)

For \(\lambda x,y.f(\lambda z .U(z, y, x), U(x, y,x))\) and \(\lambda x',y'.f(\lambda z' . h(y', z',x'),\) \(h(y',x',x'))\), \(\mathfrak {P}\) computes their generalization \(\lambda x,y. f(\lambda z. Y_1(x,y,z),Y_2(x,y))\).
From the examples one can notice yet another advantage of using the store (besides helping in the merging): In the final state, it contains AUPs from which one can get the substitutions that show how the original terms can be obtained from the computed result.
4 Properties of the Set of Transformations \(\mathfrak {P}\)
In this section we will prove termination (Theorem 1), soundness (Theorem 2) and completeness (Theorem 3) of \(\mathfrak {P}\).
Theorem 1
(Termination) The set of transformations \(\mathfrak {P}\) is terminating.
Moreover, any transformation sequence starting in the state \(A;S;\sigma \) terminates in \(\mathscr {O}({A}+{S})\) steps.
Proof
We define the measure of a state as \(M(A;S;\sigma )=2\,{A}+{S}\). All rules in \(\mathfrak {P}\) strictly decrease this measure. \(\square \)
Theorem 2
 (a)
\(X\sigma \) is a higherorder pattern in \(\eta \)long \(\beta \)normal form,
 (b)
\(X\sigma \preceq t\) and \(X\sigma \preceq s\).
Proof
To prove that \(X\sigma \) is a higherorder pattern, we use the facts that first, X is a higher order pattern and, second, at each step \(A_1;S_1;\varphi \Longrightarrow A_2; S_2;\varphi \vartheta \) if \(X\varphi \) is a higherorder pattern, then \(X\varphi \vartheta \) is also a higherorder pattern. The latter property follows from stability of patterns under substitution application and the fact that substitutions in the rules map variables to higherorder patterns. As for \(X\sigma \) being in \(\eta \)long \(\beta \)normal form, this is guaranteed by the series of applications of the Abs rule, even if Dec introduces an AUP whose generalization term is not in this form. It finishes the proof of (a).

Dec: Here \(t=h(t_1,\ldots ,t_m)\), \(s=h(s_1,\ldots ,s_m)\), and \(\vartheta = \{X \mapsto \lambda \vec {x}\cdot h(Y_1(\vec {x}),\) \(\ldots ,Y_m(\vec {x}))\}\). Then \(X(\vec {x})\vartheta = h(Y_1(\vec {x}),\ldots ,Y_m(\vec {x}))\). Let \(\psi _1\) and \(\psi _2\) be substitutions defined, respectively, by \(Y_i\psi _1 = \lambda \vec {x}\cdot t_i\) and \(Y_i\psi _2 = \lambda \vec {x}\cdot s_i\) for all \(1\le i \le m\). Such substitutions obviously exist since the Y’s introduced by the Dec rule are fresh. Then \(X(\vec {x})\vartheta \psi _1 = h(t_1,\ldots ,t_m)\), \(X(\vec {x})\vartheta \psi _2 = h(s_1,\ldots ,s_m)\) and, hence, \(X(\vec {x})\vartheta \preceq t\) and \(X(\vec {x})\vartheta \preceq s\).

Abs: Here \(t= \lambda y_1.t'\), \(s= \lambda y_2.s'\), and \(\vartheta = \{X \mapsto \lambda \vec {x},y. X'(\vec {x},y)\}\). Then \(X(\vec {x})\vartheta = \lambda y. X'(\vec {x},y)\). Let \(\psi _1 = \{X'\mapsto \lambda \vec {x},y. t'\}\) and \(\psi _2 = \{X'\mapsto \lambda \vec {x},y. s'\}\). Then \(X(\vec {x})\vartheta \psi _1 = \lambda y.t' = t\), \(X(\vec {x})\vartheta \psi _2 = \lambda y.s' = s\), and, hence, \(X(\vec {x})\vartheta \preceq t\) and \(X(\vec {x})\vartheta \preceq s\).

Sol: We have \(\vartheta = \{X\mapsto \lambda \vec {x}\cdot Y(\vec {y})\}\), where \(\vec {y}\) is the subsequence of \(\vec {x}\) consisting of the variables that appear freely in t or s. Let \(\psi _1 = \{Y\mapsto \lambda \vec {y}\cdot t \}\) and \(\psi _2 = \{Y\mapsto \lambda \vec {y}\cdot s \}\). Then \(X(\vec {x})\vartheta \psi _1 = t\), \(X(\vec {x})\vartheta \psi _2 = s\), and, hence, \(X(\vec {x})\vartheta \preceq t\) and \(X(\vec {x})\vartheta \preceq s\).
Now, we proceed by induction on the length l of the transformation sequence. In fact, we will prove a more general statement: If \(A_0;S_0;\vartheta _0 \Longrightarrow ^* \varnothing ; S_n;\vartheta _0\vartheta _1\cdots \vartheta _n\) is a transformation sequence in \(\mathfrak {P}\), then for any \(X(\vec {x}):t \triangleq s\in A_0\cup S_0\) we have \(X(\vec {x})\vartheta _1\cdots \vartheta _n \preceq t \) and \(X(\vec {x})\vartheta _1\cdots \vartheta _n \preceq s\).
When \(l=1\), it is exactly the onestep case we just proved. Assume that the statement is true for any transformation sequence of the length n and prove it for a transformation sequence \(A_0;S_0;\vartheta _0 \Longrightarrow A_1;S_1;\vartheta _0\vartheta _1 \Longrightarrow ^* \varnothing ; S_n;\vartheta _0\vartheta _1\cdots \vartheta _n\) of the length \(n+1\).

Dec: \(t=h(t_1,\ldots ,t_m)\), \(s=h(s_1,\ldots ,s_m)\) and \(X(\vec {x})\vartheta _1^1 = h(Y_1(\vec {x}),\ldots ,\) \(Y_m(\vec {x}))\). By the induction hypothesis, \(Y_i(\vec {x})\vartheta _2^n\preceq t_i\) and \(Y_i(\vec {x})\vartheta _2^n\preceq s_i\) for all \(1\le i \le m\). By construction of \(\vartheta _2^n\), if there is \(U\in \mathrm {Vars}(\mathrm {Ran}(\vartheta _2^n))\), then there is an AUP of the form \(U(\vec {u}):t'\triangleq s'\in S_n\). Let \(\sigma \) (resp. \(\varphi \)) be a substitution which maps each such U to the corresponding \(t'\) (resp. \(s'\)). Then \(Y_i(\vec {x})\vartheta _2^n\sigma = t_i\) and \(Y_i(\vec {x})\vartheta _2^n\varphi = s_i\). Since \(X(\vec {x})\vartheta _1^n = h(Y_1(\vec {x}),\ldots ,Y_m(\vec {x}))\vartheta _2^n\), we get that \(X(\vec {x})\vartheta _1^n \sigma = t\), \(X(\vec {x})\vartheta _1^n \varphi = s\), and, hence, \(X(\vec {x})\vartheta _1^n \preceq t\) and \(X(\vec {x})\vartheta _1^n \preceq s\).

Abs: Here \(t= \lambda y_1.t'\), \(s= \lambda y_2.s'\), \(X(\vec {x})\vartheta _1^1= \lambda y. X'(\vec {x},y)\), and \(A_1\) contains the AUP \(X'(\vec {x},y): t'\{y_1\mapsto y\} \triangleq s'\{y_2\mapsto y\}\). By the induction hypothesis, \(X'(\vec {x},y)\vartheta _2^n\preceq t'\{y_1\mapsto y\}\) and \(X'(\vec {x},y)\vartheta _2^n\preceq s'\{y_1\mapsto y\}\). Since \(X(\vec {x})\vartheta _1^n= \lambda y. X'(\vec {x},y)\vartheta _2^n\) and due to the way how y was chosen, we finally get \(X(\vec {x})\vartheta _1^n\preceq \lambda y. t'\{y_1\mapsto y\}=t\) and \(X(\vec {x})\vartheta _1^n\preceq \lambda y. s'\{y_2\mapsto y\}=s\).

Sol: We have \(X(\vec {x})\vartheta _1^1 = Y(\vec {y})\) where Y is in the store. By the induction hypothesis, \(Y(\vec {y})\vartheta _2^n\preceq t\) and \(Y(\vec {y})\vartheta _2^n\preceq s\). Therefore, \(X(\vec {x})\vartheta _1^n \preceq t\) and \(X(\vec {x})\vartheta _1^n \preceq s\).
\(\square \)

\(\sigma \) is a substitution computed by \(\mathfrak {P}\) for \(X: t \triangleq s\);

the restriction of \(\sigma \) on X, \(\sigma _X\), is an antiunifier of \(X:t \triangleq s\) computed by \(\mathfrak {P}\).
Theorem 3
(Completeness) Let \(\lambda \vec {x}\cdot t_1\) and \(\lambda \vec {x}\cdot t_2\) be higherorder terms and \(\lambda \vec {x}\cdot s\) be a higherorder pattern such that \(\lambda \vec {x}\cdot s\) is a generalization of both \(\lambda \vec {x}\cdot t_1\) and \(\lambda \vec {x}\cdot t_2\). Then, there exists a transformation sequence \(\{X(\vec {x}): t_1\triangleq t_2\};\varnothing ;\varnothing \Longrightarrow ^* \varnothing ; S;\sigma \) in \(\mathfrak {P}\) such that \(\lambda \vec {x}\cdot s \preceq X\sigma \).
Proof
By structural induction on s. We can assume without loss of generality that \(\lambda \vec {x}\cdot s\) is an lgg of \(\lambda \vec {x}\cdot t_1\) and \(\lambda \vec {x}\cdot t_2\). We also assume that it is in the \(\eta \)long \(\beta \)normal form.
If s is a variable, then there are two cases: Either \(s\in \vec {x}\), or \(s\notin \vec {x}\). In the first case, we have \(s=t_1 = t_2\). The Dec rule gives \(\sigma = \{X\mapsto \lambda \vec {x}\cdot s\}\) and, hence, \(\lambda \vec {x}\cdot s \preceq \lambda \vec {x}\cdot X(\vec {x})\sigma = s\). In the second case, either \(\mathrm {Head}(t_1)\ne \mathrm {Head}(t_2)\), or \(\mathrm {Head}(t_1)= \mathrm {Head}(t_2) \notin \vec {x}\). Sol is supposed to give us \(\sigma =\{X\mapsto \lambda \vec {x}\cdot X'(\vec {x'})\}\), where \(\vec {x'}\) is a subsequence of \(\vec {x}\) consisting of variables occurring freely in \(t_1\) or in \(t_2\). But \(\vec {x'}\) should be empty, because otherwise s would not be just a variable (remember that \(\lambda \vec {x}\cdot s\) is an lgg of \(\lambda \vec {x}\cdot t_1\) and \(\lambda \vec {x}\cdot t_2\) in the \(\eta \)long \(\beta \)normal form). Hence, we have \(\sigma =\{X\mapsto \lambda \vec {x}\cdot X'\}\) and \(\lambda \vec {x}\cdot s\preceq \lambda \vec {x}\cdot X(\vec {x})\sigma \), because \(s\{s\mapsto X'\} = X(\vec {x})\sigma \).
If s is a constant c, then \(t_1=t_2 =c\). We can apply the Dec rule, obtaining \(\sigma = \{X\mapsto \lambda \vec {x}\cdot c\}\) and, hence, \(s = c \preceq X(\vec {x})\sigma = c\). Therefore, \(\lambda \vec {x}\cdot s \preceq \lambda \vec {x}\cdot X(\vec {x})\sigma \).
If \(s = \lambda x. s'\), then \(t_1\) and \(t_2\) must have the forms \(t_1 = \lambda x. t'_1\) and \(t_2=\lambda y. t'_2\), and \(s'\) must be an lgg of \(t'_1\) and \(t'_2\). Abs gives a new state \(\{X'(\vec {x},x):t'_1\triangleq t'_2\{x\mapsto y\}\};\varnothing ;\sigma _1\), where \(\sigma _1 = \{X\mapsto \lambda \vec {x},x. X'(\vec {x},x)\}\). By the induction hypothesis, we can compute a substitution \(\sigma _2\) such that \(\lambda \vec {x},x.s'\preceq \lambda \vec {x},x.X'(\vec {x},x)\sigma _2\). Composing \(\sigma _1\) and \(\sigma _2\) into \(\sigma \), we have \(X(\vec {x})\sigma = \lambda x. X'(\vec {x},x)\sigma _2\). Hence, we get \(\lambda \vec {x}\cdot s = \lambda \vec {x}\cdot \lambda x. s' \preceq \lambda \vec {x}\cdot \lambda x. X'(\vec {x},x)\sigma _2 = \lambda \vec {x}\cdot X(\vec {x})\sigma \).
Finally, assume that s is a compound term \(h(s_1,\ldots ,s_n)\). If \(h\notin \vec {x}\) is a variable, then \(s_1,\ldots ,s_n\) are distinct variables from \(\vec {x}\) (because \(\lambda \vec {x}\cdot s\) is a higherorder pattern). That means that \(s_1,\ldots ,s_n\) appear freely in \(t_1\) or \(t_2\). Moreover, either \(\mathrm {Head}(t_1)\ne \mathrm {Head}(t_2)\), or \(\mathrm {Head}(t_1)=\mathrm {Head}(t_2) = h\). In both cases, we can apply the Sol rule to obtain \(\sigma = \{X\mapsto \lambda \vec {x}\cdot Y(s_1,\ldots ,s_n)\}\). Obviously, \(\lambda \vec {x}\cdot s \preceq \lambda \vec {x}\cdot X(\vec {x})\sigma = \lambda \vec {x}\cdot Y(s_1,\ldots ,s_n)\).
If \(h\in \vec {x}\) or if it is a constant, then we should have \(\mathrm {Head}(t_1)=\mathrm {Head}(t_2)\). Assume they have the forms \(t_1=h(t^1_1,\ldots ,t^1_n)\) and \(t_2=h(t^2_1,\ldots ,t^2_n)\). We proceed by the Dec rule, obtaining \(\{Y_i(\vec {x}):t^1_i \triangleq t^2_i \mid 1\le i \le n\};\varnothing ;\sigma _0\), where \(\sigma _0 = \{X \mapsto \lambda \vec {x}\cdot h(Y_1(\vec {x}),\ldots , Y_n(\vec {x}))\}\). By the induction hypothesis, we can construct transformation sequences \(\Delta _1,\ldots ,\Delta _n\) computing the substitutions \(\sigma _1,\ldots ,\sigma _n\), respectively, such that \(\lambda \vec {x}\cdot s_i\preceq \lambda \vec {x}\cdot Y_i(\vec {x})\sigma _i\) for \(1\le i \le n\). These transformation sequences, together with the initial Dec step, can be combined into one transformation sequence, of the form \(\Delta = \{X(\vec {x}):t_1 \triangleq t_2 \};\varnothing ;\sigma _0\Longrightarrow \{Y_i(\vec {x}):t^1_i \triangleq t^2_i \mid 1\le i \le n\};\varnothing ;\sigma _0 \Longrightarrow ^* \varnothing ; S_n; \sigma _0\sigma _1\cdots \sigma _n\).
 (a)
\(t_1_{p_1}\) and \(t_1_{p_2}\) differ from each other by a permutation of variables bound in \(t_1\),
 (b)
\(t_2_{p_1}\) and \(t_2_{p_2}\) differ from each other by the same (modulo variable renaming) permutation of variables bound in \(t_2\),
 (c)
the path to \(p_1\) is the same (modulo bound variable renaming) in \(t_1\) and \(t_2\). It equals (modulo bound variable renaming) the path to \(p_1\) in s, and
 (d)
the path to \(p_2\) is the same (modulo bound variable renaming) in \(t_1\) and \(t_2\). It equals (modulo bound variable renaming) the path to \(p_2\) in s.
Depending which AUP is selected to perform a transformation, there can be different transformation sequences in \(\mathfrak {P}\) starting from the same initial state, but leading to different generalizations. The next theorem states that all those generalizations are equivalent.
Theorem 4
(Uniqueness Modulo \(\simeq \) ) Let \(\{X: t\triangleq s\};\varnothing ;\varnothing \Longrightarrow ^* \varnothing ; S_1;\sigma _1\) and \(\{X: t\triangleq s\};\) \(\varnothing ;\varnothing \Longrightarrow ^* \varnothing ; S_2;\sigma _2\) be two transformation sequences in \(\mathfrak {P}\), where \(\varnothing ; S_1;\sigma _1\) and \(\varnothing ; S_2;\sigma _2\) are final states. Then \(X\sigma _1 \simeq X\sigma _2\).
Proof
It is not hard to notice that if it is possible to change the order of applications of rules (but sticking to the same selected AUPs for each rule) then the result remains the same: If \(\Delta _1 = A_1;S_1;\sigma _1 \Longrightarrow _{\mathrm {R1}} A_2;S_2;\sigma _1\vartheta _1 \Longrightarrow _{\mathrm {R2}} A_3; S_3; \sigma _1\vartheta _1\vartheta _2\) and \(\Delta _2= A_1;S_1;\sigma _1 \Longrightarrow _{\mathrm {R2}} A'_2;S'_2;\sigma _1\vartheta _2 \Longrightarrow _{\mathrm {R1}} A'_3; S'_3; \sigma _1\vartheta _2\vartheta _1\) are two twostep transformation sequences, where R1 and R2 are (not necessarily different) rules and each of them transforms the same AUP(s) in both \(\Delta _1\) and \(\Delta _2\), then \(A_3=A'_3\), \(S_3=S'_3\), and \(\sigma _1\vartheta _1\vartheta _2=\sigma _1\vartheta _2\vartheta _1\) (modulo the names of fresh variables).
Dec, Abs and Sol rules transform the selected AUP in a unique way. We show that it is irrelevant in which order we perform matching in the Mer rule.
Let Open image in new window \( \sigma \{Y \mapsto \lambda \vec {y}\cdot Z(\vec {z}\pi )\} \) be the merging step with \(\pi ={\mathrm {match}}(\{\vec {z}\}, \{\vec {y}\},\) Open image in new window . If we do it in the other way around, we would get the step Open image in new window , where \(\mu ={\mathrm {match}}(\{\vec {y}\},\) Open image in new window . But \(\mu =\pi ^{1}\), because of bijection.
Next, starting from \(X\vartheta _1\rho _2\), we can transform it as \( X\vartheta _1\rho _2= X\sigma \rho _1\rho _2 = X\sigma \{Y \mapsto \lambda \vec {y}\cdot Z(\vec {z}\pi )\rho _2,Z \mapsto \lambda \vec {z}\cdot Y(\vec {y}\pi ^{1} )\} =_{\text {by (1)}} X\sigma \{Y \mapsto \lambda \vec {y}\cdot Z(\vec {z}\pi )\rho _2,Z \mapsto \lambda \vec {z}\cdot Y(\vec {y}\pi ^{1} )\}= X\sigma \{Y \mapsto \lambda \vec {y}\cdot Y(\vec {y}),Z \mapsto \lambda \vec {z}\cdot Y(\vec {y}\pi ^{1} )\} = X\sigma \{Y \mapsto \lambda \vec {y}\cdot Y(\vec {y})\}\{Z \mapsto \lambda \vec {z}\cdot Y(\vec {y}\pi ^{1} )\}.\) At this step, since the equality \(=\) is \(\alpha \beta \eta \)equivalence, we can omit the application of the substitution \(\{Y \mapsto \lambda \vec {y}\cdot Y(\vec {y})\}\) and proceed: \( X\sigma \{Y \mapsto \lambda \vec {y}\cdot Y(\vec {y})\}\{Z \mapsto \lambda \vec {z}\cdot Y(\vec {y}\pi ^{1} )\} = X\sigma \{Z \mapsto \lambda \vec {z}\cdot Y(\vec {y}\pi ^{1} )\} = X\sigma \rho _2 X\vartheta _2.\) Hence, we got \(X\vartheta _1\rho _2=X\vartheta _2\), which implies \(X\vartheta _1 \preceq X\vartheta _2\).
The fact \(X\vartheta _2 \preceq X\vartheta _1\) can be proved analogously. Hence, \(X\vartheta _1\simeq X\vartheta _2\), which means that it is irrelevant in which order we perform matching in the Merge rule. Therefore, no matter how different transformation sequences are constructed, the computed generalizations are equivalent. \(\square \)
Corollary 1
For any given terms t and s, and any transformation sequence \(\{X: t\triangleq s\};\varnothing ;\varnothing \Longrightarrow ^* \varnothing ; S;\sigma \) in \(\mathfrak {P}\), the higherorder pattern \(X\sigma \) is the unique least general generalization of t and s.
5 Complexity
In this section we describe an algorithm based on the set of transformations \(\mathfrak {P}\) and prove that this algorithm has linear time complexity. Notice that the Termination Theorem (Theorem 1) already proves that any transformation sequence in \(\mathfrak {P}\) has at most linear length. However, the direct application of one transformation rule, as described in Sect. 3, may require quadratic time, what would result in a cubic time algorithm, similar to our previous one [5]. In this section we will improve this result.
Remark 2
 1.
All pointers require constant space and all basic operations on them can be done in constant time. This assumption is popular in the literature, despite the fact that it is inaccurate: In any implementation of trees based on the use of pointers, these will need space \(\mathscr {O}(\log n)\), since they address a memory of size \(\mathscr {O}(n)\). The same argument applies to all traditional algorithms for firstorder unification. In fact, all those claimed to be linear, without this assumption would have \(\mathscr {O}(n\,\log n)\) time complexity. Therefore, we will continue with this traditional assumption.
 2.
All symbols can be represented in constant space and all basic operations on them done in constant time. Again, this assumption is traditional, but inaccurate. A term with \(\mathscr {O}(n)\) symbols, where all of them can be distinct, would require \(\mathscr {O}(\log n)\) space to represent each single symbol.
 3.
When we represent lambdaterms in de Bruijn form, indexes of bound variables can be represented in constant space.
The proposed algorithm works in three phases. Later, we will prove that each one of them can be done in linear time. The following lemma allows us to decompose any transformation sequence in the three phases that we will analyze separately.
Lemma 1
Proof
We apply rules Dec and Abs exhaustively until all equations can only be transformed by the Sol rule. Notice that conditions of applicability of Dec and Abs and of Sol are disjoint. Then, in the second phase we only apply the Sol rule. Notice that this rule does not introduce new equations where Dec and Abs could be applicable. Then, in the third phase, when the first component of the tuple is empty, we only apply the Mer rule. Again, notice that this rule does not introduce new equations in the first component. By Theorem 4, following this strategy, we get an equivalent antiunifier. \(\square \)
Example 2
Lemma 2
Proof
In the Dec rule, we reuse the representation of \(t_i\) from \(f(t_1,\dots ,t_m)\) to get a representation of each \(t_i\). Using a directed acyclic graph, we also reuse the argument vector \(\vec {x}\) from the representation of \(X(\vec {x})\) in the original AUP and \(\sigma \) to construct the representation of each \(Y_i(\vec {x})\) in the new AUPs and substitution. We assume that in constant time, by using an appropriate hash table we can find the unique occurrence of X in \(\sigma \), and hence compute \(\sigma \{X\mapsto \lambda \vec {x}\cdot h(Y_1(\vec {x}),\ldots ,Y_m(\vec {x}))\}\) in time \(\mathscr {O}(m)\). Notice that \(\beta \)reduction is trivial in this use case. Therefore, the rule can be applied in time \(\mathscr {O}(m)\), and the space requirements also increase in \(\mathscr {O}(m)\).
In the Abs rule, we reuse the representation of \(\lambda y.t\) to construct the representation of t. We also reuse the representation of \(\vec {x}\) from \(X(\vec {x})\) when we construct the representation of \(X'(\vec {x},y)\). The most expensive step is to compute the substitution \(s\{z\mapsto y\}\). We assume that, using an appropriated data structure where all occurrences of the bound variable z are linked, this can be done in linear time on the number of occurrence of this variable. This structure can be constructed for the initial problem in linear time.
Finally, \({P} \le t+s\) is proved by inspection of the rules. \(\square \)
Lemma 3
Proof
In this second phase, we basically move equations from the problem set P to the store S. However, notice that the arguments of generalization variables in \(X(\vec {x}):t\triangleq s\) are narrowed in \(Y(\vec {y}):t\triangleq s\), where \(\vec {y}\) is a subset of \(\vec {x}\). As only those argument variables which appear in one of the terms t and s are kept, the length of the narrowed vector \(\vec {y}\) is bound by \({t}+{s}\).
There is no need to share the representation of those narrowed argument vectors \(\vec {y}\) anymore. The representation of \(\vec {y}\) can be constructed without reusing the representation of \(\vec {x}\) in linear time on the size of the AUP \(X(\vec {x}) : t \triangleq s\). The substitution composition \(\sigma \{X\mapsto \lambda \vec {x}\cdot Y(\vec {y})\}\) used by Sol is equivalent to replacing the subterm \(X(\vec {x})\) by \(Y(\vec {y})\). Again, this composition can also be done in linear time on \({X(\vec {x}) : t \triangleq s}\). Therefore, all the phase can be computed on time \(\mathscr {O}(P)\).
Finally, \({S}={P}\) by inspection of the rule. \(\square \)
As a direct consequence of the previous two lemmas, we can conclude that the size of the store S, after the first and second phase of the antiunification algorithm, is linear on the size of the original problem.
In order to prove that the third phase can also be implemented with a linear time algorithm, we can not directly use the rules described in previous section. This would lead to a cubic time algorithm, similar to our previous one [5].
First, we will reduce computation of permuting matchers to \(\alpha \)equivalence, using the following lemma.
Lemma 4
Given two terms t and s in \(\eta \)long \(\beta \)normal form, and two sets of variables \(D\subseteq \mathrm {Vars}(t)\) and \(R\subseteq \mathrm {Vars}(s)\), we have Open image in new window if, and only if, \(\lambda \vec {x}\cdot t\) is \(\alpha \)equivalent to \(\lambda \vec {y}\cdot s\), where \(\vec {x}\) (resp. \(\vec {y}\)) is an ordering of the set D (resp. R) such that, for any \(i<j\), the variable \(x_i\) occurs free for the first time in t (resp s) before the first occurrence of \(x_j\) with respect to the depthfirst preorder traversal of t (resp s).
Moreover, we can decide if Open image in new window in time \(\mathscr {O}({t}+{s})\).
Proof
Two terms \(\lambda \vec {x}\cdot t\) and \(\lambda \vec {y}\cdot s\) are \(\alpha \)equivalent if, and only if, \(t[\vec {x}\mapsto \vec {y}]\) and s are \(\alpha \)equivalent. If all variables \(\{\vec {x}\}\) occur free in t, all variables \(\{\vec {y}\}\) free in s, and t and s are in \(\beta \)normal form, then \(t[\vec {x}\mapsto \vec {y}]\) and s are \(\alpha \)equivalent if, and only if, Open image in new window . Now, notice that, if Open image in new window , then the permuting matcher must be \([\vec {x}\mapsto \vec {y}]\), with sequences \(\vec {x}\) and \(\vec {y}\) constructed as stated in the lemma. Altogether, this proves the first part of the lemma.
The second part of the lemma relies on the first part. Given a term t (resp. s), we can close it, adding lambdabindings for all free variables in D (resp. R) in the same order they appear for the first time in the term w.r.t. the depthfirst preorder traversal. Then, we can transform it into de Bruijn form. Both processes can be done in linear time.^{3} Then, to check if the two closing sequences (sequences of lambdabindings) yield a permuting matcher for the two terms, we only need to check, in linear time, if the de Bruijn forms of the closed representations are equal. \(\square \)
However, we still have the problem that this should be repeated for any pair of AUPs in the store. A naive implementation of this process would result into a quadratic algorithm in the size of the store. However, this can be done in quasilinear time using the following result.
Lemma 5
Given a sequence of numbers \(S=n_1,\dots ,n_m\) written in binary, we can partition this set into subsets of equal numbers in time \(\mathscr {O}(S\,\log S)\).
Proof
If we assume that numbers are bounded and representable in constant space, then we can use a perfect hash table. We add each number in the hash table, and construct the subsets from them in time \(\mathscr {O}(m)=\mathscr {O}(S)\).
If numbers are unbounded, then we can use a similar idea, without using hash functions. We use a trie, i.e. a binary tree T such that, for any \(i=1,\dots ,m\), tree T contains a node at position \(n_i\) with label i. Starting with the empty tree, we proceed adding all necessary nodes to ensure that \(n_i\) is a position in the tree. This can be done in linear time on the representation of \(n_i\), i.e. \(\mathscr {O}(\log n_i)=\mathscr {O}(n_i)\). Then, we add label i to the corresponding node, on time linear on the representation of i. At the end, sets of labels will represent subsets of equal numbers in S.
Since \(m=\mathscr {O}(n)\),we have \(i= \mathscr {O}(\log S)\). Therefore, this algorithm requires time \(\mathscr {O}(S\,\log S)\). \(\square \)
Using the same ideas as Hinze [17], the previous lemma can be generalized when, instead of numbers written in binary, we have other objects represented in a fixed signature. In our case, we will use the algorithm described in the proof of Lemma 5 to find subsets of \(\alpha \)equivalent \(\lambda \)terms. For this purpose, we translate \(\alpha \)terms into their de Bruijn form, and then represent them by their preorder traversal sequence.
We are now able to establish the complexity of the third phase of the algorithm.
Lemma 6
(Third phase) There exists an algorithm that, given a state \(\varnothing ;S;\sigma \), computes \(S'\) and \(\sigma '\), where \(\varnothing ;S;\sigma \Longrightarrow ^*_{\textsf {Mer}} \varnothing ;S';\sigma '\), in time \(\mathscr {O}(S)\).
Proof
As we have described in the proof of Lemma 4, we close all terms, adding lambda binders for all free occurrences of variables, and translate them into de Bruijn form. This can be done in linear time. By Lemma 4, to decide the existence of a permuting matcher for two terms, we only need to check \(\alpha \)equivalence, i.e., equality once the term is represented in de Bruijn form. Here we assume that every de Bruijn index only requires constant space in order to be represented. Assuming also that the rest of symbols also require constant space, we can represent the terms by their preorder traversal sequence. This traversal sequence has size linear on the size of the term. By Lemma 5, we can partition the store S into subsets of equal terms, i.e. subsets of permuting matchers. Then, applying rule Mer to all AUPs of the subset, we remove all but one representative of each subset. In each application, the cost of applying the rule is linear on the size of the removed equation, since the side condition of the rule has already been checked. \(\square \)
From Lemmas 1, 2, 3 and 6 we can conclude the following result.
Theorem 5
(Complexity of \(\mathfrak {P}\) ) Computing the least general higherorder pattern generalization of two simply typed lambdaterms in \(\eta \)long \(\beta \)normal form has linear time complexity on the size of the input.
Remark 3
The previous result is only valid under the assumption that all pointers, de Bruijn indexes, and representation of symbols only require constant space. Without these assumptions, we could prove that our antiunification algorithm—like all traditional unification algorithms—is, in fact, quasilinear. One could argue that in traditional unification algorithms, input terms are also represented as trees encoded with pointers; hence, input size is also quasilinear on the number of symbols of the terms, like the complexity of the algorithm; therefore, the algorithm is quasilinear in the number of symbols, but linear in the size of the input. However, this is not true. As Jacobsen proves, for trees with a constant number of distinct nodes, there exist succinct representations that only require linear space on the number of nodes [20]. In this representation, even to simply access a node’s child requires logarithmic time. Since in complexity theory inputs are assumed to be represented succinctly, even the traditional algorithm for tree traversal strictly requires quasilinear time.
6 Conclusion and Final Remarks
We designed an algorithm for computing higherorder pattern generalizations for simply typed lambda terms. The algorithm does not assume any restriction on the input except requiring them to be terms of the same type in the \(\eta \)long \(\beta \)normal form. The computed pattern is a least general generalization of the input terms, and is unique modulo free variable renaming and \(\alpha \)equivalence. It is computed in linear time in the size of input (under the usual conventions made in the unification literature on the space requirements for encoding symbols and pointers, and on the complexity of basic operations on them).
One can observe that the set of transformations \(\mathfrak {P}\) used in the paper can be adapted with a relatively little effort to work on untyped terms (cf. the formulation of the unification algorithm both for untyped and simply typed patterns [30]). One thing to be added is lazy \(\eta \)expansion: The AUP of the form \(X(\vec {x}): \lambda y.t \triangleq h(s_1,\ldots ,s_m)\) should be transformed into \(X(\vec {x}): \lambda y.t \triangleq \lambda z.h(s_1,\ldots ,s_m,z)\) for a fresh z. (Dually for abstractions in the right hand side). In addition, Sol needs an extra condition for the case when \(\mathrm {Head}(t)=\mathrm {Head}(s)\) but the terms have different number of arguments such as, e.g., in f(a, x) and f(b, x, y). Note that the complexity of the algorithm will remain linear in the untyped case, since the side enlargements in the lazy \(\eta \)expansion are bounded by the size of the original problem in such a way that in the worst case, by summing up all enlargements one would only duplicate the size.
The antiunification algorithm has been implemented (both for simply typed and untyped terms, without perfect hashing, using a simpler but more expensive method to compute permuting matchers) in Java as a part of an opensource antiunification library [4]. It can be used online or can be downloaded freely from http://www.risc.jku.at/projects/stout/software/hoau.php.
As for the related topics, we mention nominal antiunification. Several authors explored relationship between nominal terms and higherorder patterns (see, e.g., [12, 14, 23, 24] among others), proposing translations between them in the context of unification. However, it is not immediately clear how to reuse those translations for antiunification, in particular, how to get nominal generalizations from pattern generalizations. Therefore, we proposed a direct algorithm for nominal antiunification [6].
Studying antiunification in the calculi with more complex type systems, such as the extension of the system F with subtyping \(F_{<:}\) [11], would be a very interesting direction of future work, as it may have applications in clone detection and refactoring for the functional programming languages in the ML family.
Footnotes
 1.
 2.
In the uniqueness of permuting matchers it plays an important role the fact that D and R are subsets of the free variables occurring in \(t_i\) and \(s_i\), respectively, and that these terms are in \(\beta \)normal form.
 3.
Strictly speaking, this is false in the worst case, since we can have a linear number of bound variables, and it would require de Bruijn indexes that can only be represented in logarithmic space. Then, the resulting algorithm would be quasilinear. However, as we do for encodings of symbols and pointers, we will follow the standard practice and assume that de Bruijn indexes only require constant space.
Notes
Acknowledgments
Open access funding provided by Johannes Kepler University.
References
 1.Alpuente, M., Escobar, S., Espert, J., Meseguer, J.: A modular ordersorted equational generalization algorithm. Inf. Comput. 235, 98–136 (2014)MathSciNetCrossRefMATHGoogle Scholar
 2.Armengol, E., Plaza, E.: Bottomup induction of feature terms. Mach. Learn. 41(3), 259–294 (2000)CrossRefMATHGoogle Scholar
 3.Barendregt, H.: Introduction to generalized type systems. J. Funct. Progr. 1(2), 125–154 (1991)MathSciNetMATHGoogle Scholar
 4.Baumgartner, A., Kutsia, T.: A library of antiunification algorithms. In: Fermé E., Leite J. (eds.) Logics in Artificial Intelligence—14th European Conference, JELIA 2014, Funchal, Madeira, Portugal, 24–26 Sept 2014. Proceedings, Springer, Lecture Notes in Computer Science, vol. 8761, pp. 543–557 (2014)Google Scholar
 5.Baumgartner, A., Kutsia, T., Levy, J., Villaret, M.: A variant of higherorder antiunification. In: van Raamsdonk F. (ed.) 24th International Conference on Rewriting Techniques and Applications, RTA 2013, 24–26 June 2013, Eindhoven, The Netherlands, Schloss Dagstuhl  LeibnizZentrum fuer Informatik, LIPIcs, vol. 21, pp. 113–127 (2013)Google Scholar
 6.Baumgartner, A., Kutsia, T., Levy, J., Villaret, M.: Nominal antiunification. In: Fernández M. (ed.) 26th International Conference on Rewriting Techniques and Applications, RTA 2015, 29 June to 1 July 2015, Warsaw, Poland, Schloss Dagstuhl  LeibnizZentrum fuer Informatik, LIPIcs, vol. 36, pp. 57–73 (2015)Google Scholar
 7.Bulychev, P.: Duplicate code detection using Clone Digger. Python Mag. 9, 18–24 (2008)Google Scholar
 8.Bulychev, P., Minea, M.: An evaluation of duplicate code detection using antiunification. In: Proceedings of the 3rd International Workshop on Software Clones, pp. 22–27, iESE Report 038.09/E (2009)Google Scholar
 9.Bulychev, P.E., Kostylev, E.V., Zakharov, V.A. Antiunification algorithms and their applications in program analysis. In: Pnueli A., Virbitskaite I., Voronkov A. (eds.) Perspectives of Systems Informatics, 7th International Andrei Ershov Memorial Conference, PSI 2009, Novosibirsk, Russia, 15–19 June 2009. Revised Papers, Springer, Lecture Notes in Computer Science, vol. 5947, pp. 413–423 (2009)Google Scholar
 10.Burghardt, J.: Egeneralization using grammars. Artif. Intel. 165(1), 1–35 (2005)MathSciNetCrossRefMATHGoogle Scholar
 11.Cardelli, L., Martini, S., Mitchell, J.C., Scedrov, A.: An extension of system F with subtyping. Inf. Comput. 109(1/2), 4–56 (1994)MathSciNetCrossRefMATHGoogle Scholar
 12.Cheney, J.: Relating higherorder pattern unification and nominal unification. In: Proceedings of 19th International Workshop on Unification, UNIF’05, pp. 104–119 (2005)Google Scholar
 13.Dowek, G.: Higherorder unification and matching. In: Robinson, J.A., Voronkov, A. (eds.) Handbook of Automated Reasoning, pp. 1009–1062. Elsevier and MIT Press, New York (2001)CrossRefGoogle Scholar
 14.Dowek, G., Gabbay, M.J., Mulligan, D.P.: Permissive nominal terms and their unification: an infinite, coinfinite approach to nominal techniques. Log. J. IGPL 18(6), 769–822 (2010)MathSciNetCrossRefMATHGoogle Scholar
 15.Feng, C., Muggleton, S.: Towards inductive generalization in higher order logic. In: Sleeman D.H., Edwards P. (eds.) Proceedings of the Ninth International Workshop on Machine Learning (ML 1992), Aberdeen, Scotland, UK, 1–3 July 1992, Morgan Kaufmann, pp. 154–162 (1992)Google Scholar
 16.Hasker, R.W.: The replay of program derivations. PhD thesis, University of Illinois at UrbanaChampaign (1995)Google Scholar
 17.Hinze, R.: Generalizing generalized tries. J. Funct. Prog. 10(4), 327–351 (2000)MathSciNetCrossRefMATHGoogle Scholar
 18.Hirata, K., Ogawa, T., Harao, M.: Generalization algorithms for secondorder terms. In: Camacho R., King R.D., Srinivasan A. (eds.) Inductive Logic Programming, 14th International Conference, ILP 2004, Porto, Portugal, 6–8 Sept 2004, Proceedings, Springer, Lecture Notes in Computer Science, vol. 3194, pp. 147–163 (2004)Google Scholar
 19.Huet, G.: Résolution d’équations dans des langages d’ordre \(1,2,\ldots ,\omega \). PhD thesis, Université Paris VII (1976)Google Scholar
 20.Jacobsen, E.: Unification and antiunification. http://erikjacobsen.com/pdf/unification.pdf (1991)
 21.Krumnack, U., Schwering, A., Gust, H., Kühnberger, K.U.: Restricted higherorder antiunification for analogy making. In: Orgun M.A., Thornton J. (eds.) AI 2007: Advances in Artificial Intelligence, 20th Australian Joint Conference on Artificial Intelligence, Gold Coast, Australia, 2–6 Dec 2007, Proceedings, Springer, Lecture Notes in Computer Science, vol. 4830, pp. 273–282 (2007)Google Scholar
 22.Kutsia, T., Levy, J., Villaret, M.: Antiunification for unranked terms and hedges. J. Autom. Reason. 52(2), 155–190 (2014)MathSciNetCrossRefMATHGoogle Scholar
 23.Levy, J., Villaret, M.: Nominal unification from a higherorder perspective. In: Voronkov A. (ed.) Rewriting Techniques and Applications, 19th International Conference, RTA 2008, Hagenberg, Austria, 15–17 July 2008, Proceedings, Springer, Lecture Notes in Computer Science, vol. 5117, pp. 246–260 (2008)Google Scholar
 24.Levy, J., Villaret, M.: Nominal unification from a higherorder perspective. ACM Trans. Comput. Log. 13(2), 10 (2012)MathSciNetCrossRefMATHGoogle Scholar
 25.Li, H., Thompson, S.J.: Similar code detection and elimination for Erlang programs. In: Carro M., Peña R. (eds.) Practical Aspects of Declarative Languages, 12th International Symposium, PADL 2010, Madrid, Spain, 18–19 Jan 2010. Proceedings, Springer, Lecture Notes in Computer Science, vol. 5937, pp. 104–118 (2010)Google Scholar
 26.Lu, J., Mylopoulos, J., Harao, M., Hagiya, M.: Higher order generalization and its application in program verification. Ann. Math. Artif. Intel. 28(1–4), 107–126 (2000)MathSciNetCrossRefMATHGoogle Scholar
 27.MendelGleason, G.: Types and verification for infinite state systems. PhD thesis, Dublin City University (2012)Google Scholar
 28.Miller, D.: A logic programming language with lambdaabstraction, function variables, and simple unification. J. Log. Comput. 1(4), 497–536 (1991)MathSciNetCrossRefMATHGoogle Scholar
 29.Nadathur, G., Linnell, N.: Practical higherorder pattern unification with onthefly raising. In: Gabbrielli M., Gupta G. (eds.) Logic Programming, 21st International Conference, ICLP 2005, Sitges, Spain, 2–5 Oct 2005, Proceedings, Springer, Lecture Notes in Computer Science, vol. 3668, pp. 371–386 (2005)Google Scholar
 30.Nipkow, T.: Functional unification of higherorder patterns. In: Proceedings of the Eighth Annual Symposium on Logic in Computer Science (LICS ’93), Montreal, Canada, 19–23 June 1993, IEEE Computer Society, pp. 64–74 (1993)Google Scholar
 31.Pfenning, F.: Unification and antiunification in the calculus of constructions. In: Proceedings of the Sixth Annual Symposium on Logic in Computer Science (LICS ’91), Amsterdam, The Netherlands, 15–18 July 1991, IEEE Computer Society, pp. 74–85 (1991)Google Scholar
 32.Pientka, B.: Higherorder term indexing using substitution trees. ACM Trans. Comput. Log. (2009). doi: 10.1145/1614431.1614437
 33.Plotkin, G.D.: A note on inductive generalization. Mach. Intel. 5(1), 153–163 (1970)MathSciNetMATHGoogle Scholar
 34.Reynolds, J.C.: Transformational systems and the algebraic structure of atomic formulas. Mach. Intel. 5(1), 135–151 (1970)MathSciNetMATHGoogle Scholar
 35.Schmid, U.: Inductive Synthesis of Functional Programs, Universal Planning, Folding of Finite Programs, and Schema Abstraction by Analogical Reasoning, Lecture Notes in Computer Science, vol. 2654. Springer (2003)Google Scholar
Copyright information
Open AccessThis article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.