
1 Introduction

Unification solves equations over terms. For a unification problem \(M = N\), a unification algorithm finds a substitution \(\delta = [X := P, Y := Q, \ldots ]\) for the unknown variables X and Y occurring in the terms M and N so that applying \(\delta \) to the original problem makes \(\delta (M)\) and \(\delta (N)\) equal. Depending on the terms occurring in the unification problem, a unification algorithm is classified as either (standard) first-order unification or higher-order unification, where the latter solves equations over higher-order terms such as \(\lambda \)-terms. First-order unification is simple in theory and efficient in implementation [7, 11], whereas higher-order unification is more complex both in theory and in implementation [5].
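The first-order case admits a compact implementation. The following Python sketch (our own illustrative encoding, not taken from any cited system) unifies terms represented as nested tuples, with unknown variables as capitalized strings:

```python
# Minimal first-order unification sketch. Variables are strings starting
# with an uppercase letter; compound terms are tuples (functor, arg, ...).

def is_var(t):
    return isinstance(t, str) and t[:1].isupper()

def walk(t, s):
    # Follow variable bindings in substitution s to a representative term.
    while is_var(t) and t in s:
        t = s[t]
    return t

def occurs(v, t, s):
    # Occur check: does variable v occur in term t under substitution s?
    t = walk(t, s)
    if t == v:
        return True
    return isinstance(t, tuple) and any(occurs(v, a, s) for a in t[1:])

def unify(m, n, s=None):
    # Returns a substitution (dict) unifying m and n, or None on failure.
    s = dict(s or {})
    m, n = walk(m, s), walk(n, s)
    if m == n:
        return s
    if is_var(m):
        return None if occurs(m, n, s) else {**s, m: n}
    if is_var(n):
        return unify(n, m, s)
    if isinstance(m, tuple) and isinstance(n, tuple) \
            and m[0] == n[0] and len(m) == len(n):
        for a, b in zip(m[1:], n[1:]):
            s = unify(a, b, s)
            if s is None:
                return None
        return s
    return None
```

For example, unifying f(X, g(Y)) with f(a, g(b)) yields the substitution \([X := a, Y := b]\), while unifying X with f(X) fails on the occur check.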

Higher-order unification is complex because it solves equations over terms modulo \(\alpha \)-, \(\beta \)- and possibly \(\eta \)-equivalence, denoted as \(=_{\alpha \beta \eta }\). Alpha-equivalence equates two \(\lambda \)-terms M and N up to the renaming of their bound variables, denoted as \(M =_{\alpha } N\); \(\beta \)-equivalence equates two terms under \((\lambda a.M )N =_{\beta } M[a:=N] \); and \(\eta \)-equivalence states that \((\lambda a.M a) =_{\eta }M \) where a does not occur free in M. Although higher-order unification is required in logic programming languages and proof assistants based on the higher-order approach [9], full higher-order unification is undecidable and may not generate most general unifiers. Higher-order pattern unification is a restricted version of higher-order unification which solves terms modulo \(\alpha \beta _0 \eta \)-equivalence [8], where \(\beta _0\)-equivalence is a form of \(\beta \)-equivalence \((\lambda x.M)N =_{\beta _0} M[x := N]\) in which N must be a variable not occurring free in \(\lambda x.M\). Most importantly, it is an efficient process with linear-time decidability [8, 18], which is why higher-order pattern unification is popular in practice. For instance, the latest implementation of \(\lambda \) Prolog is actually an implementation of a sublanguage of \(\lambda \) Prolog called \(L_{\lambda }\), which only uses higher-order pattern unification [10]. However, the infrastructure for implementing a variant of the \(\lambda \)-calculus is not lightweight, and the restriction to \(\beta _0\)-equivalence requires users to follow good programming practice so as to avoid cases that violate the restriction. A first-order style unification algorithm for terms involving name binding is preferable in these respects.
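To illustrate the machinery that \(=_{\alpha \beta \eta }\) demands beyond plain \(=_{\alpha }\), the following sketch (entirely illustrative, not part of this paper's formalism) decides \(\alpha \beta \eta \)-equivalence of normalizing \(\lambda \)-terms by using de Bruijn indices (so \(\alpha \)-equivalence becomes syntactic equality), normal-order \(\beta \)-reduction, and \(\eta \)-contraction:

```python
# Terms: ('var', i) | ('lam', body) | ('app', f, a), with de Bruijn indices.
# Equivalence modulo alpha-beta-eta = syntactic equality of normal forms.
# Only terminates for normalizing terms.

def shift(t, d, c=0):
    # Shift free indices >= c by d.
    tag = t[0]
    if tag == 'var':
        return ('var', t[1] + d) if t[1] >= c else t
    if tag == 'lam':
        return ('lam', shift(t[1], d, c + 1))
    return ('app', shift(t[1], d, c), shift(t[2], d, c))

def subst(t, n, v):
    # t[n := v], shifting v when passing under binders.
    tag = t[0]
    if tag == 'var':
        return v if t[1] == n else t
    if tag == 'lam':
        return ('lam', subst(t[1], n + 1, shift(v, 1)))
    return ('app', subst(t[1], n, v), subst(t[2], n, v))

def free0(t, c=0):
    # Does de Bruijn index c occur free in t?
    if t[0] == 'var':
        return t[1] == c
    if t[0] == 'lam':
        return free0(t[1], c + 1)
    return free0(t[1], c) or free0(t[2], c)

def normalize(t):
    # Normal-order beta reduction followed by eta contraction.
    tag = t[0]
    if tag == 'app':
        f = normalize(t[1])
        if f[0] == 'lam':   # beta step
            return normalize(shift(subst(f[1], 0, shift(t[2], 1)), -1))
        return ('app', f, normalize(t[2]))
    if tag == 'lam':
        b = normalize(t[1])
        # eta: lam.(f 0) == f, provided index 0 is not free in f
        if b[0] == 'app' and b[2] == ('var', 0) and not free0(b[1]):
            return shift(b[1], -1)
        return ('lam', b)
    return t
```

Even this toy checker needs index shifting, capture-avoiding substitution, and a freshness test for the \(\eta \)-rule; by contrast, a unification algorithm working modulo \(=_{\alpha }\) alone can stay within first-order territory.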

One such unification algorithm is nominal unification [14], which solves equations of nominal terms. In nominal terms, names are equipped with the swapping operation and the freshness condition [4]. The work in [2, 6] shows the connection between nominal unification and higher-order pattern unification; if two nominal terms are unifiable, then their translated higher-order pattern counterparts are also unifiable. Alpha-equivalence is assumed for higher-order terms in theory. Yet, in the higher-order approach, implementing a meta-language (a variant of the typed \(\lambda \)-calculus) means that one must also consider \(=_{ \beta _0 \eta }\). In nominal unification, only \(=_{\alpha }\) is needed, and variable capture is allowed during unification in the sense that a unifier may bring a name a into the scope of a binder of a, as in \((\lambda a.X)[X:=a]\). Nominal unification solves problems in two phases: solving equations over terms and solving freshness constraints.

Using graphs to represent \(\lambda \)-terms has a long history [19, 20]. In our earlier work, we studied a hypergraph-based technique for representing terms involving name binding [16], using HyperLMNtal [13] as a representation and implementation language. The idea was that hypergraphs could naturally express terms containing bindings; atoms (nodes of graphs) represent constructors such as abstraction and application; hyperlinks (edges with multiple endpoints) represent variables; and regular links (edges with two endpoints) connect constructors with each other. In this technique, two isomorphic (but not identical) hypergraphs representing \(\alpha \)-equivalent terms containing bindings have two syntactically different textual representations in HyperLMNtal. For example, two instances of the \(\lambda \)-term \(\lambda a.aa\) are represented by \(\alpha \)-equivalent but syntactically different hypergraphs such as abs(A,(app(A,A)),L) and abs(B,(app(B,B)),R) as shown in Fig. 1.

Fig. 1. Two \(\alpha \)-equivalent terms represented as hypergraphs

In Fig. 1, circles are atoms, straight lines are regular links and eight-point stars with curved lines are hyperlinks. The arrowheads on circles indicate the first arguments of atoms and the ordering of their arguments. These two hypergraphs, rooted at L and R, are isomorphic, i.e., have the same shape, but are syntactically not identical. (Later, we explain why regular links between abs and app atoms are implicit in the above two terms.)
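The isomorphism of such rooted structures can be sketched in a few lines (a hypothetical encoding of our own, not HyperLMNtal itself): canonicalize hyperlink names by their order of first occurrence in a root-first traversal, then compare syntactically.

```python
# Terms: hyperlink names are strings; constructors are tuples like
# ('abs', binder, body) or ('app', f, a). Two terms are alpha-equivalent
# hypergraphs iff their canonical forms coincide.

def canon(t, env=None):
    env = {} if env is None else env
    if isinstance(t, str):                       # a hyperlink name
        return env.setdefault(t, f'h{len(env)}')  # rename by first visit
    return (t[0],) + tuple(canon(a, env) for a in t[1:])

left  = ('abs', 'A', ('app', 'A', 'A'))   # abs(A,app(A,A))
right = ('abs', 'B', ('app', 'B', 'B'))   # abs(B,app(B,B))
assert canon(left) == canon(right)        # isomorphic, though not identical
```

Both terms canonicalize to the same form, capturing exactly the "isomorphic but syntactically different" situation of Fig. 1.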

Our idea was first proposed in [16], where we developed the theory with the encoding of the untyped \(\lambda \)-calculus. Our formalism separates bound and free variables by Barendregt’s variable convention [1] and also requires bound variables to be distinct from each other. A graph type called hlground (meaning ground graphs made up of hyperlinks) keeps bound variables distinct during substitution. For example, \(\lambda a.M\) and \(\lambda a.N\) do not exist at the same time, and if \(\lambda a.M\) exists, a may occur in M only. Such conventions may look too strict, but our experience shows that they bring great convenience in practice. For example, in our recent work [17], we encoded System \(F_\texttt {<:}\) easily in HyperLMNtal; implementing the type checking of System \(F_\texttt {<:}\) required the equality checking of types containing type variable binders, which was handled by directly applying the \(\alpha \)-equality rules of the theory. As the next step, we want to implement the type inference of System \(F_\texttt {<:}\), which means that we should study the unification of terms containing name binding within our formalism.

Hypergraphs representing \(\lambda \)-terms are called hypergraph \(\lambda \)-terms. This paper considers unification problems for equations over hypergraph \(\lambda \)-terms modulo \(=_{\alpha }\). Hypergraph \(\lambda \)-terms have nice properties; for two abstractions L=abs(A,M) and R=abs(B,N), A does not occur in N and B does not occur in M, and \(\texttt {A}\) and \(\texttt {B}\) are always different hyperlinks. These properties greatly simplified the reasoning in our previous work, and we expect such simplicity in this work as well.

The outline of the paper is as follows. In Sect. 2, we briefly describe hypergraph \(\lambda \)-terms and the definition of substitutions. In Sect. 3, we present the unification algorithm and related proofs. In Sect. 4, we give some examples. In Sect. 5, we briefly describe the implementation of the unification algorithm. In Sect. 6, we review related work and conclude the paper.

2 Hypergraph \(\lambda \)-Terms

HyperLMNtal is a modeling language based on hypergraph rewriting [13] that is intended to be a substrate language of diverse computational models, especially those addressing concurrency, mobility and multiset rewriting. Moreover, we have successfully encoded the \(\lambda \)-calculus with strong reduction in HyperLMNtal in two different ways, one in the fine-grained approach [12] and the other in the coarse-grained approach [16]. This paper takes the latter approach that uses hyperlinks to represent binders, where the representation of \(\lambda \)-terms is called hypergraph \(\lambda \)-terms. We briefly describe HyperLMNtal and hypergraph \(\lambda \)-terms.

2.1 HyperLMNtal

In HyperLMNtal, hypergraphs consist of graph nodes called atoms, undirected edges with two endpoints called regular links and edges with multiple endpoints called hyperlinks. The simplified syntax of hypergraphs in HyperLMNtal is as follows,

$$\begin{aligned} (\textit{Hypergraphs})\;\; P \,{\texttt {:}{} \texttt {:}}= 0 \;\,| \;\, p(A_1,\ldots ,A_m) \;\, | \;\, P,P \end{aligned}$$

where link names (denoted by \(A_i\)) and atom names (denoted by p) are presupposed. Hypergraphs are the principal syntactic category: 0 is an empty hypergraph; \(p(A_1,\ldots ,A_m)\) is an atom with arity m; and \(P,P\) is parallel composition. A hypergraph P is transformed by a rewrite rule of the form \(H \mathop {\textit{:-}} G \texttt {|} B\) when a subgraph of P matches (i.e., is isomorphic to) H and the auxiliary conditions specified in G are satisfied, in which case the subgraph of P is rewritten into another hypergraph B. The auxiliary conditions include type constraints and equality constraints. In HyperLMNtal programs, names starting with lowercase letters denote atoms and names starting with uppercase letters denote links. An abbreviation called term notation is frequently used in HyperLMNtal programs. It allows an atom b, with its final argument omitted, to occur as an argument of another atom a when the two arguments are interconnected by a regular link. For instance, f(a,b) represents the graph f(A,B),a(A),b(B), and C=app(A,B) represents the graph app(A,B,C). The latter example shows that an n-ary constructor can be represented by an \((n+1)\)-ary HyperLMNtal atom whose final argument stands for the root link of the constructor.
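The flattening performed by the term notation can be sketched as follows (the Python encoding and link-naming scheme are ours, introduced only for illustration):

```python
# Flatten a nested term into a list of atoms with explicit links.
# A term is either a string atom name or (name, [subterms]).
# Each atom gets one extra final argument: the link standing for its root.

def flatten(term):
    atoms, counter = [], [0]

    def new_link():
        counter[0] += 1
        return f'L{counter[0]}'

    def go(t):
        # Returns the fresh link that stands for t's root.
        name, args = (t, []) if isinstance(t, str) else t
        links = [go(a) for a in args]     # links to the argument atoms
        root = new_link()
        atoms.append((name, links + [root]))  # final argument = root link
        return root

    go(term)
    return atoms
```

Applied to f(a,b), this yields the atoms a(L1), b(L2) and f(L1,L2,L3), mirroring how f(a,b) abbreviates f(A,B),a(A),b(B) with an extra root link for f itself.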

In a rewrite rule, placing a hyperlink creation constraint on A in the guard means that A is created as a fresh hyperlink with an attribute a given as a natural number. A type constraint specified in the guard describes a class of graphs with specific shapes. For example, the graph type hlink( A ) ensures that A is a hyperlink occurrence. A graph type hlground( \(A,a_1,\ldots , a_n\) ) identifies a subgraph rooted at the link A, where \(a_1,\ldots , a_n\) are the attributes of the hyperlinks allowed to occur in the subgraph. The identified subgraph may be copied or removed according to rewrite rules. Details appear in Sect. 2.2.

2.2 Hypergraph \(\lambda \)-Terms

We define hypergraph \(\lambda \)-terms by the following syntax.

figure a

Here, A ranges over hyperlinks whose attributes are determined as follows: hyperlinks representing variables bound inside M, or in a larger term containing M, are given attribute 1 (denoted \(A^1\)), while those not bound anywhere are given attribute 2 (denoted \(A^2\)). Hypergraph \(\lambda \)-terms are straightforwardly obtained from \(\lambda \)-terms. For example, the Church numeral 2

$$\begin{aligned} \lambda x. \lambda y. x (x y ) \end{aligned}$$

is written as

$$ \texttt {R=abs(A,abs(B,app(A,app(A,B))))}. $$

Note that both abs and app are ternary atoms; their third arguments, made implicit by the term notation, are links connected to their parent atoms or, for the outermost atom, written explicitly as the leftmost R.

The following rewrite rules show how to work with hypergraph \(\lambda \)-terms in HyperLMNtal.

figure b

The first rule creates a hypergraph representing the Church numeral 2. The second rule creates an application of two Church numerals.

The idea behind the hypergraph-based approach is that it applies the principle of Barendregt’s variable convention (bound variables should be separated from free variables to allow easy reasoning) also to bound variables; all bound variables should be distinct from each other upon creation and should be kept distinct from each other during substitution. Besides keeping bound variables distinct, one should avoid variable capture during substitution.

In a substitution \((\lambda y.M)[x:=N]\), replacing x with N in M will not lead to variable capture if y is kept distinct from the variables of N. The idea is to ensure that variables appear distinctly in \(M_1\) and \(M_2\) in an application \(M_1 M_2\). Concretely, in a substitution \((M_1 M_2)[x:=N]\), we generate two \(\alpha \)-equivalent but syntactically different copies of N, say \(N_1\) and \(N_2\), to obtain \((M_1[x:=N_1])(M_2[x:=N_2])\). For a hypergraph \(\lambda \)-term with distinct variables, applying such a strategy in the substitution ensures that \(y \notin fv(N)\) for \((\lambda y.M)[x:=N]\). To summarize, we use distinct hyperlinks with appropriate attributes to represent the distinct variables of \(\lambda \)-terms and don’t allow multiple binders of the same variable.
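The copying strategy above can be sketched in Python on a named-term representation (this is an illustrative model of ours, not the HyperLMNtal implementation): substituting into an application hands each branch its own freshly renamed copy of N, so bound names stay globally distinct.

```python
import itertools

# Terms: ('var', a) | ('abs', a, M) | ('app', M, N), with every binder
# carrying a globally distinct name (Barendregt-style convention).

_fresh = itertools.count()

def rename_copy(t):
    # An alpha-equivalent copy of t with brand-new bound names.
    def go(t, env):
        tag = t[0]
        if tag == 'var':
            return ('var', env.get(t[1], t[1]))
        if tag == 'abs':
            b = f'b{next(_fresh)}'
            return ('abs', b, go(t[2], {**env, t[1]: b}))
        return ('app', go(t[1], env), go(t[2], env))
    return go(t, {})

def subst(m, x, n):
    # m[x := n], duplicating n with fresh binders at each application.
    tag = m[0]
    if tag == 'var':
        return n if m[1] == x else m
    if tag == 'abs':
        # By convention m[1] != x and m[1] does not occur in n.
        return ('abs', m[1], subst(m[2], x, n))
    return ('app', subst(m[1], x, rename_copy(n)),
                   subst(m[2], x, rename_copy(n)))
```

For instance, \((x\,x)[x := \lambda a.a]\) produces an application of two \(\alpha \)-equivalent abstractions whose binders are distinct fresh names, just as hlground produces two syntactically different copies.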

We use sub atoms to represent substitutions; sub( \(X,N,M\) ) represents \(M[X:=N]\). The definition of substitutions for hypergraph \(\lambda \)-terms is given in Fig. 2, where each rule is prefixed by a rule name. The rule beta implements \(\beta \)-reduction, and the other four rules implement substitutions. When the rule var2 is applied, a subgraph matched with hlground(N,1) is removed. When the rule app is applied, two \(\alpha \)-equivalent but syntactically different copies of a subgraph matched by hlground(N,1) are created. The constraint hlink(X) checks whether X is a hyperlink.

Fig. 2. Definition of substitutions on hypergraph \(\lambda \)-terms

The graph type hlground(N,1) identifies a subgraph rooted at \(\texttt {N}\); rewriting may then copy or remove that subgraph. When a rule copies a subgraph identified by hlground(N,1), fresh copies are created of the attribute-1 hyperlinks that occur only inside the subgraph, whereas attribute-1 hyperlinks that also occur outside the subgraph, as well as all hyperlinks whose attribute is different from 1, are shared between the copies of the subgraph. When a rule removes a subgraph identified by hlground(N,1), the subgraph is removed along with all hyperlink endpoints in it.

Fig. 3. Applying a substitution on an application

For example, the rule app rewrites R=sub(A,abs(B,B),app(A,A)) in Fig. 3a to R=app(sub(A,abs(K,K),A),sub(A,abs(H,H),A)) in Fig. 3b, where the constraint hlground(N,1) identifies a subgraph N=abs(B,B) which is copied into abs(K,K) and abs(H,H). The rule var2 rewrites R=abs(A,sub(B,A,C)) in Fig. 3c to R=abs(A,C) in Fig. 3d, where hlground(N,1) identifies a subgraph N=A and then the subgraph containing one endpoint of A is removed. For more details of hlground, readers are referred to our previous work [16].

3 Unification

We extend hypergraph \(\lambda \)-terms with unknown variables of unification problems, denoted by \(X,Y,\ldots \), in a standard manner. Let \(A, B, C, D\) be hyperlinks, \(M, N, P\) be hypergraph \(\lambda \)-terms, and \(L, R\) be regular links occurring as the last arguments of the atoms representing \(\lambda \)-term constructors.

The assumed equality between hypergraph \(\lambda \)-terms in our unification is \(\alpha \)-equivalence with freshness constraints. When no confusion may arise, we write \(=\) instead of \(=_{\alpha }\) for the sake of simplicity. For a unification problem \(M=N\) of two hypergraphs M and N containing unknown variables \(X,Y, \ldots \), the goal is to find hypergraph \(\lambda \)-terms which replace \(X,Y, \ldots \) and ensure the \(\alpha \)-equivalence of M and N. To reason about the equality of non-ground hypergraph \(\lambda \)-terms (hypergraphs containing unknown variables), we use the concepts of swapping \(\leftrightarrow \) and freshness \(\#\) from the nominal approach [4].

Lemma 1

In hypergraph \(\lambda \)-terms, for an abstraction abs( \(A,M\) ), the hyperlink A occurs in M only.

Proof

Follows from the construction of hypergraph \(\lambda \)-terms.     \(\square \)

Henceforth, note that the last arguments of atoms representing \(\lambda \)-term constructors are implicit in terms related by \(=\) and \(\#\).

Lemma 2

For two \(\alpha \)-equivalent hypergraph \(\lambda \)-terms

$$\begin{aligned} \texttt {abs}(A,M)=\texttt {abs}(B,N)\ , \end{aligned}$$

the following holds,

  • \(A \# N\) and \( B \# M \),

  • \(M=[ A \leftrightarrow B]N\) and \([ A \leftrightarrow B] M = N \),

where \(A \# N\) denotes that A is fresh for N (or A is not in N) and \([ A \leftrightarrow B]N\) denotes the swapping of A and B in N.

Proof

Follows from Lemma 1 and the fact that hyperlinks representing bound variables are distinct in hypergraph \(\lambda \)-terms.     \(\square \)

In Lemma 2, we could use renaming \(M=[ A\)/B]N and [B/\(A] M = N \) instead of swapping, where [A/B]N means replacing B by A in N. Moving [A/B] to the left-hand side of \(=\) requires switching A and B. Using swapping saves us from such a switching operation in the implementation. Another point is that it is clear from their definitions that swapping subsumes renaming. In \([ A \leftrightarrow B] N \), the swapping \([ A \leftrightarrow B] \) applies to every hyperlink in N until it reaches an unknown variable X occurring in N. We suspend the swapping when it encounters an unknown variable X until X is instantiated to a non-variable term.

Definition 1

Let \(\pi \) be a list of swappings \([ A_1 \leftrightarrow B_1,\ldots , A_n \leftrightarrow B_n ]\), \(var(\pi ) = \{ A_1,B_1, \ldots , A_n,B_n \}\), and \(\pi ^{-1}=[ A_n \leftrightarrow B_n, \ldots , A_1 \leftrightarrow B_1 ]\). Applying \(\pi \) to a term M is written as \(\pi \varvec{\cdot } M\). When M is an unknown variable X, we call \(\pi \varvec{\cdot } M\) a suspension. The inductive definition of applying swappings to hypergraph \(\lambda \)-terms is defined as follows, where \(\pi @ \pi '\) is a concatenation of \(\pi \) and \(\pi '\).

$$\begin{aligned} \pi @ [A \leftrightarrow C] \varvec{\cdot } B&{\mathop {=}\limits ^{\text {def}}} \pi \varvec{\cdot } B \qquad (A \ne B, B \ne C)\\ \pi @ [A \leftrightarrow C] \varvec{\cdot } A&{\mathop {=}\limits ^{\text {def}}} \pi \varvec{\cdot } C \\ \pi @ [C \leftrightarrow A] \varvec{\cdot } A&{\mathop {=}\limits ^{\text {def}}} \pi \varvec{\cdot } C \\ \pi \varvec{\cdot } \texttt {abs}(A,M)&{\mathop {=}\limits ^{\text {def}}} \texttt {abs}(A, \pi \varvec{\cdot } M) \\ \pi \varvec{\cdot } \texttt {app}(M,N)&{\mathop {=}\limits ^{\text {def}}} \texttt {app}(\pi \varvec{\cdot } M, \pi \varvec{\cdot } N) \\ \pi \varvec{\cdot } (\pi ' \varvec{\cdot } M)&{\mathop {=}\limits ^{\text {def}}} \pi @ \pi ' \varvec{\cdot } M \\ [\,] \varvec{\cdot } M&{\mathop {=}\limits ^{\text {def}}} M \\ \end{aligned}$$
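Definition 1 can be transcribed almost directly into executable form (the tuple encoding of terms below is ours, introduced only for illustration): swappings act on hyperlink names with the rightmost swapping applied first, push through abs and app, compose on nested applications, and suspend on unknown variables.

```python
# A swapping list pi = [(A1,B1),...,(An,Bn)] applied to terms:
#   ('hlink', a) | ('abs', a, M) | ('app', M, N) | ('unknown', X)
# A suspension is represented as ('susp', pi, X).

def swap_name(pi, a):
    # Per Definition 1, the last swapping in the list applies first.
    for x, y in reversed(pi):
        if a == x:
            a = y
        elif a == y:
            a = x
    return a

def apply_pi(pi, t):
    tag = t[0]
    if tag == 'hlink':
        return ('hlink', swap_name(pi, t[1]))
    if tag == 'abs':
        # Bound hyperlinks are never swapped (all binders are distinct).
        return ('abs', t[1], apply_pi(pi, t[2]))
    if tag == 'app':
        return ('app', apply_pi(pi, t[1]), apply_pi(pi, t[2]))
    if tag == 'susp':
        return ('susp', list(pi) + list(t[1]), t[2])  # pi . (pi' . X) = pi@pi' . X
    return ('susp', list(pi), t[1])                   # suspend on unknown X
```

For instance, applying \([A \leftrightarrow B]\) to abs(C, app(A, X)) swaps the free hyperlink A to B, leaves the bound hyperlink C untouched, and suspends on X.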

We don’t apply swapping to hyperlinks representing the bound variables of an abs (the fourth rule in Definition 1) because all bound variables are distinct in hypergraph \(\lambda \)-terms, and a swapping is only created from two abstractions using the rule =abs in Fig. 4. We use a freshness constraint \(\#\) in the equality judgment of non-ground hypergraph \(\lambda \)-terms, and write \(\theta \vdash M = N\) to denote that M and N are \(\alpha \)-equal terms under a set \(\theta \) of freshness constraints called a freshness environment. For example,

$$ \{ \texttt {A} \#X, \texttt {B} \#X \} \vdash \texttt {abs(A,}\ X\mathtt{)} = \texttt {abs(B,}\ X\mathtt{)} $$

is a valid judgment. Likewise, we write \(\theta \vdash A \# M\) to say that \( A \# M\) holds under \(\theta \). For example, \(\texttt {A} \# X \vdash \texttt {A} \# \texttt {app(}{} \textit{X}{} \mathtt{,B)}\) is a valid judgment. With swapping and freshness constraints, judging the equality of two non-ground hypergraph \(\lambda \)-terms is simple, as shown in Fig. 4.

Fig. 4. The equality and freshness judgments for non-ground hypergraph \(\lambda \)-terms

The soundness of most of the rules in Fig. 4 should be self-evident. Below we give some lemmas to justify =susp and #susp. It is important to note that the rules in Fig. 4 are assumed to be used in a goal-directed manner starting from hypergraph \(\lambda \)-terms M and N. In the following lemmas, “obtained by applying rules in Fig. 4 and Definition 1” means that we use the rules in Fig. 4 in a goal-directed, backward manner and the rules in Definition 1 in the left-to-right direction. By doing so, we obtain a set of unification rules which works on two unifiable terms and fails for two non-unifiable terms.

When judging the equality of two non-ground hypergraph \(\lambda \)-terms using the rules in Fig. 4, swappings are only generated by the rule =abs, and these swappings are applied to terms by the rules in Definition 1. During this process, we may have judgments such as \(\theta \vdash \pi \varvec{\cdot } M = \pi ' \varvec{\cdot } N \) and \(\theta \vdash A \# \pi \varvec{\cdot } M \). As mentioned before, a swapping is always created from two abstractions which have distinct bound hyperlinks. Therefore, in a judgment, swappings enjoy the following properties: each swapping always has two distinct hyperlinks, and two swappings generated by the rule =abs have no hyperlinks in common. For example, a judgment never contains swappings such as \([ A \leftrightarrow A ] \) or \([ A \leftrightarrow B, B \leftrightarrow C ] \).
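A goal-directed checker for these judgments can be sketched as follows. Since Fig. 4 itself is a figure, the rules below are our reconstruction from Lemmas 2, 6 and 7 and the surrounding text, and may differ in detail from the authors' exact rules; unknowns are uniformly encoded as suspensions with an empty swapping list, and theta is a set of pairs (A, X) meaning \(A \# X\).

```python
# Terms: ('hlink', a) | ('abs', a, M) | ('app', M, N) | ('susp', pi, X).

def sw(pi, a):
    for x, y in reversed(pi):           # last swapping applies first
        a = y if a == x else x if a == y else a
    return a

def push(pi, t):
    # Push a swapping list into a term, suspending on unknowns.
    tag = t[0]
    if tag == 'hlink':
        return ('hlink', sw(pi, t[1]))
    if tag == 'abs':
        return ('abs', t[1], push(pi, t[2]))
    if tag == 'app':
        return ('app', push(pi, t[1]), push(pi, t[2]))
    return ('susp', list(pi) + list(t[1]), t[2])

def fresh(theta, a, t):
    # theta |- a # t, used backward.
    tag = t[0]
    if tag == 'hlink':
        return a != t[1]
    if tag == 'abs':
        return fresh(theta, a, t[2])    # a never equals a bound hyperlink
    if tag == 'app':
        return fresh(theta, a, t[1]) and fresh(theta, a, t[2])
    return (a, t[2]) in theta           # #susp (a outside var(pi), Lemma 6)

def equal(theta, m, n):
    # theta |- m = n, used backward; swappings are pushed onto n (=abs).
    if m[0] == 'hlink' and n[0] == 'hlink':
        return m[1] == n[1]
    if m[0] == 'abs' and n[0] == 'abs':
        a, b = m[1], n[1]
        return (fresh(theta, a, n[2]) and fresh(theta, b, m[2])
                and equal(theta, m[2], push([(a, b)], n[2])))
    if m[0] == 'app' and n[0] == 'app':
        return equal(theta, m[1], n[1]) and equal(theta, m[2], n[2])
    if m[0] == 'susp' and n[0] == 'susp' and m[2] == n[2]:
        # =susp via Lemma 7: every hyperlink of pi@pi' must be fresh for X.
        names = {c for p in (m[1], n[1]) for xy in p for c in xy}
        return all((c, m[2]) in theta for c in names)
    return False
```

Under theta = \(\{ \texttt {A} \#X, \texttt {B} \#X \}\), the checker validates the example judgment abs(A, X) = abs(B, X) given above.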

Lemma 3

If the judgment

$$\begin{aligned} \theta \vdash \pi \varvec{\cdot } M = \pi ' \varvec{\cdot } N \end{aligned}$$

is obtained by applying rules in Fig. 4 and Definition 1, then \(var(\pi ) \cap var(\pi ') = \emptyset \) holds.

Proof

Follows from the fact that hyperlinks of a swapping are distinct.     \(\square \)

Note that the rules in Fig. 4 and Definition 1 generate non-empty swappings only on the right-hand side of equations, so the \(\pi \) above is actually empty. Nevertheless, we allow non-empty swappings on the left-hand side in this and the following lemmas because the claims generalize to equations generated by the unification algorithm described later in Fig. 5.

Lemma 4

If the judgment

$$ \theta \vdash \pi \varvec{\cdot } \texttt {abs(A,M)} = \pi ' \varvec{\cdot } \texttt {abs(B,N)} $$

is obtained by applying rules in Fig. 4 and Definition 1, then \(A \notin var(\pi @\pi ')\) and \(B \notin var(\pi @\pi ')\) hold.

Proof

The same as the proof of Lemma 3.     \(\square \)

The next lemma states how swappings move between two sides of \(=\) in a judgment.

Lemma 5

\(\theta \vdash M = \pi \varvec{\cdot } N\) obtained by applying rules in Fig. 4 and Definition 1 holds if and only if \(\theta \vdash \pi ^{-1} \varvec{\cdot } M = N\) holds.

Proof

\((\Rightarrow )\) Let \(\pi =[A_1 \leftrightarrow B_1,\ldots , A_n \leftrightarrow B_n]\). Because freshness constraints are generated only from the rule =abs, we can assume that \(A_1, \ldots , A_n\) occur only in N, that \(B_1,\ldots ,B_n\) occur only in M, and that \(\theta \) contains \(\{ A_1 \# M, \ldots , A_n \# M,\) \(B_1 \# N, \ldots , B_n \# N \}\). If \(N=A_i\) for some i, then \(M=B_i\) by assumption and the rule =hlink, in which case \(\pi ^{-1}\cdot M=A_i\) and the lemma holds. If N is a hyperlink not in \(var(\pi )\), then M and N are the same hyperlink not in \(var(\pi )\) and the lemma holds obviously. If N is an unknown variable, the lemma is again obvious from the rule =susp. The other cases are straightforward by structural induction.

\((\Leftarrow )\) The proof of the other direction is similar.     \(\square \)

The next lemma justifies the rule #susp in Fig. 4.

Lemma 6

\(\theta \vdash A \, \# \, \pi \varvec{\cdot } M \) obtained by applying rules in Fig. 4 and Definition 1 holds if and only if \(\theta \vdash \pi ^{-1} \varvec{\cdot } A \, \# \, M\) holds.

Proof

\((\Rightarrow )\) By Lemma 4 and the fact that freshness constraints are created by the rule =abs, we know that \(A \not \in var(\pi )\). Therefore, if \(\theta \vdash A \, \# \, \pi \varvec{\cdot } M \), \(\theta \vdash \pi ^{-1} \varvec{\cdot } A \, \# \, M\) holds.

\((\Leftarrow )\) For the same reason, \(A \not \in var(\pi ^{-1})\). Therefore, if \(\theta \vdash \pi ^{-1} \varvec{\cdot } A \, \# \, M\) holds, \(\theta \vdash A \,\# \, \pi \varvec{\cdot } M \) holds.     \(\square \)

The next lemma justifies the rule =susp in Fig. 4.

Lemma 7

\(\theta \vdash \pi \varvec{\cdot } M = \pi ' \varvec{\cdot } M \) obtained by applying rules in Fig. 4 and Definition 1 holds for \(\pi \) and \(\pi '\) if and only if \(A \# M \in \theta \) for all \(A \in var( \pi @ \pi ')\).

Proof

\((\Rightarrow )\) By Lemma 3, we know that \(var(\pi ) \cap var(\pi ') = \emptyset \). Therefore, in order for \(\theta \vdash \pi \varvec{\cdot } M = \pi ' \varvec{\cdot } M \) to hold, \(\pi \) and \(\pi '\) should have no effects on M, which means \(var(\pi @ \pi ') \cap var(M) = \emptyset \), which is the same as \(A \# M \in \theta \) for all \(A \in var(\pi @ \pi ')\).

\((\Leftarrow )\) If \(A \# M \in \theta \) for all \( A \in var(\pi @ \pi ')\), obviously, \(\theta \vdash \pi \varvec{\cdot } M = \pi ' \varvec{\cdot } M\) holds.     \(\square \)

Theorem 1

The relation \(=\) defined in Fig. 4 is an equivalence relation, i.e.,

  1. (a)

    \(\theta \vdash M=M\),

  2. (b)

    \(\theta \vdash M=N\) implies \(\theta \vdash N=M\),

  3. (c)

    \(\theta \vdash M=N\) and \(\theta \vdash N=P\) implies \(\theta \vdash M=P\).

Proof

  • (a) When M is a hyperlink A, \(A = A\) follows from the rule =hlink. When M is an abstraction, note that M stands for an \(\alpha \)-equivalence class; for example, M stands for either M = abs(A,A) or M = abs(B,B). Assume \(P=P\) as the induction hypothesis, \(A \# P\), and that B occurs in P; then \(P = [A \leftrightarrow B]@[B \leftrightarrow A]\varvec{\cdot }P\) holds. Let \(N = [B \leftrightarrow A] \varvec{\cdot } P\); then it is clear that \(B \# N\). Clearly, \(\texttt {abs(B,P)} = \texttt {abs(A,N)}\) holds, and therefore \(M=M\) holds for abstractions. When M is an application, the proof is again by structural induction. The equivalence of terms containing suspensions follows from the rule =susp and Lemma 7.

  • (b) When M and N are hyperlinks, \(\vdash M=N\) by the rule =hlink simply implies \(\vdash N=M\). When M and N are M = abs( \(A,N_1\) ) and N = abs( \(B,N_2\) ) respectively, \( \vdash M = N\) leads to \( \vdash N_1 = [A \leftrightarrow B]\varvec{\cdot }N_2\), \(\vdash A \# N_2\) and \(\vdash B \# N_1\) by the rule =abs. By Lemma 5 and the induction hypothesis, we have \( \vdash N_2 = [A \leftrightarrow B] \varvec{\cdot } N_1 \), \(\vdash A \# N_2 \) and \( \vdash B \# N_1\), which leads to abs( \(B,N_2\) ) = abs( \(A,N_1\) ). When M and N are applications, the proof is by the rule =app and using the induction hypothesis twice. The equivalence of terms containing suspensions follows from the rule =susp and Lemma 7.

  • (c) When \(M, N\) and P are hyperlinks, the claim holds trivially. When \(M, N\) and P are M = abs( \(A,M_1\) ), N = abs( \(B,M_2\) ) and P = abs( \(C,M_3\) ), we have \( \vdash M_1=[A \leftrightarrow B] \varvec{\cdot } M_2 \), \( \vdash A\# M_2 \), \( \vdash B\#M_1\) and \( \vdash M_2=[B \leftrightarrow C] \varvec{\cdot } M_3 \), \( \vdash B\# M_3 \), \( \vdash C \# M_2\) by =abs. By Lemma 1, we know that \(A \# M_3\) and \(C \# M_1\). By Lemma 5 and the induction hypothesis, we have \(\{ A\#M_3, C\#M_1 \} \vdash M_1=[A \leftrightarrow B]@[B \leftrightarrow C] \varvec{\cdot } M_3\), which is the same as \(\{ A\#M_3, C\#M_1 \} \vdash M_1= [A \leftrightarrow C ] \varvec{\cdot } M_3\), which leads to \(\vdash \texttt {abs(}A,M_1\texttt {)} = \texttt {abs(}C,M_3\texttt {)}\) by =abs. The case of applications is straightforward. The equivalence of terms containing suspensions follows from the rule =susp and Lemma 7.

    \(\square \)

Fig. 5. Unification of hypergraph \(\lambda \)-terms

A substitution \(\delta \) is a finite set of mappings from unknown variables to terms, written as \( [X := M_1, Y := M_2, \ldots ]\), where its domain, \(\textit{dom}(\delta )\), is a set of distinct unknown variables \(\{X, Y, \ldots \}\). Applying \(\delta \) to a term M is written as \(\delta (M)\) and is defined in a standard manner. A composition of substitutions is written as \(\delta \circ \delta '\) and defined as \((\delta \circ \delta ') (M) = \delta (\delta '(M))\). We write \(\varepsilon \) for the identity substitution. Substitution commutes with swapping, i.e., \(\delta (\pi \varvec{\cdot } M) = \pi \varvec{\cdot } (\delta (M))\); for example, applying \([X:=A]\) to \(\pi \varvec{\cdot } X\) results in \(\pi \varvec{\cdot } A\). For two sets of freshness constraints \(\theta \) and \(\theta '\), and substitutions \(\delta \) and \(\delta '\), writing \(\theta ' \vdash \delta (\theta )\) means that \(\theta ' \vdash A \# \delta (X)\) holds for all \((A \# X) \in \theta \), and \(\theta \vdash \delta =\delta '\) means that \(\theta \vdash \delta (X)=\delta '(X)\) for all \(X\in \textit{dom}(\delta )\cup \textit{dom}(\delta ')\).

The definitions of unification, most general unifiers and idempotent unifiers are similar to those in nominal unification [14]. A unification problem P is a finite set of equations over hypergraph \(\lambda \)-terms and freshness constraints. Each equation \(M = N\) may contain unknown variables \(X,Y, \ldots \ \). A solution of P is a unifier, denoted \((\theta ,\delta )\), consisting of a set \(\theta \) of freshness constraints and a substitution \(\delta \). A unifier \((\theta ,\delta )\) of a problem P solves every equation in P, i.e., establishes \(\theta \vdash \delta (M) = \delta (N)\). \(\mathcal {U}(P)\) denotes the set of unifiers of a problem P. For P, a unifier \((\theta ,\delta ) \in \mathcal {U}(P)\) is a most general unifier if for any unifier \((\theta ',\delta ') \in \mathcal {U}(P)\), there is a substitution \(\delta ''\) such that \(\theta ' \vdash \delta ''(\theta )\) and \(\theta ' \vdash \delta '' \circ \delta = \delta ' \). A unifier \((\theta ,\delta ) \in \mathcal {U}(P)\) is idempotent if \(\theta \vdash \delta \circ \delta = \delta \).
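The composition and idempotence conditions can be illustrated on a deliberately tiny model where unknowns map to atomic terms (a simplification of the definitions above, not the full term language):

```python
# Substitutions as dicts from unknown names to atomic terms (strings).
# (delta1 o delta2)(M) = delta1(delta2(M)).

def compose(d1, d2):
    def app(d, t):
        # Apply d to an atomic term: rewrite it if it is a mapped unknown.
        return d.get(t, t)
    out = {x: app(d1, t) for x, t in d2.items()}  # d1 applied inside d2
    for x, t in d1.items():
        out.setdefault(x, t)                      # keep d1's own mappings
    return out
```

A substitution whose range mentions none of its domain variables is idempotent: composing it with itself changes nothing, matching \(\delta \circ \delta = \delta \).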

The unification algorithm is described in Fig. 5, where P is a given unification problem and \(\delta \) is a substitution which is usually initialized to \(\varepsilon \). Each rule arbitrarily selects an equation or a freshness constraint from P and transforms it accordingly. The rule =abs transforms an equation and creates two freshness constraints, through which all the freshness constraints we need are obtained. That is why the rule =rm simply deletes an equation without creating any freshness constraints. The rule =var creates a substitution \(\delta '\) from an equation (provided X does not occur in M), applies \(\delta '\) to P and adds \(\delta '\) to \(\delta \). The rules in Fig. 5 essentially correspond to the rules in Fig. 4 except for the rule =var. The next lemma justifies the rule =var.
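A runnable sketch of the algorithm follows, reconstructed from the rule names mentioned in the text (=abs, =app, =rm, =var and the freshness rules); the term encoding, the helper functions (repeated here so the snippet is self-contained), and the simplified freshness bookkeeping are all ours, so this should be read as an approximation of Fig. 5 rather than the authors' exact rule set.

```python
# Terms: ('hlink', a) | ('abs', a, M) | ('app', M, N) | ('susp', pi, X),
# with unknowns written as ('susp', [], X). A problem is a list of
# ('eq', M, N) and ('fr', a, M) items; unify returns (theta, delta) or None.

def sw(pi, a):
    for x, y in reversed(pi):           # last swapping applies first
        a = y if a == x else x if a == y else a
    return a

def push(pi, t):
    tag = t[0]
    if tag == 'hlink':
        return ('hlink', sw(pi, t[1]))
    if tag == 'abs':
        return ('abs', t[1], push(pi, t[2]))
    if tag == 'app':
        return ('app', push(pi, t[1]), push(pi, t[2]))
    return ('susp', list(pi) + list(t[1]), t[2])

def occurs(x, t):
    tag = t[0]
    if tag == 'susp':
        return t[2] == x
    if tag == 'abs':
        return occurs(x, t[2])
    return tag == 'app' and (occurs(x, t[1]) or occurs(x, t[2]))

def apply1(x, v, t):
    # Apply the single substitution [x := v] to term t.
    tag = t[0]
    if tag == 'susp' and t[2] == x:
        return push(t[1], v)
    if tag == 'abs':
        return ('abs', t[1], apply1(x, v, t[2]))
    if tag == 'app':
        return ('app', apply1(x, v, t[1]), apply1(x, v, t[2]))
    return t

def unify(problem):
    p, theta, delta = list(problem), set(), {}
    while p:
        kind, a, b = p.pop()
        if kind == 'eq':
            m, n = a, b
            if m[0] == 'hlink' and n[0] == 'hlink':
                if m[1] != n[1]:
                    return None
            elif m[0] == 'abs' and n[0] == 'abs':            # =abs
                p += [('fr', m[1], n[2]), ('fr', n[1], m[2]),
                      ('eq', m[2], push([(m[1], n[1])], n[2]))]
            elif m[0] == 'app' and n[0] == 'app':            # =app
                p += [('eq', m[1], n[1]), ('eq', m[2], n[2])]
            elif m[0] == 'susp' and n[0] == 'susp' and m[2] == n[2]:
                pass                                         # =rm
            elif m[0] == 'susp' or n[0] == 'susp':           # =var
                if n[0] == 'susp':
                    m, n = n, m
                pi, x = m[1], m[2]
                v = push(list(reversed(pi)), n)              # pi^{-1} . n
                if occurs(x, v):
                    return None
                p = [(k, apply1(x, v, u) if k == 'eq' else u,
                      apply1(x, v, w)) for k, u, w in p]
                delta = {y: apply1(x, v, t) for y, t in delta.items()}
                delta[x] = v
            else:
                return None
        else:                                                # a # b
            t = b
            if t[0] == 'hlink':
                if a == t[1]:
                    return None
            elif t[0] == 'abs':
                p.append(('fr', a, t[2]))
            elif t[0] == 'app':
                p += [('fr', a, t[1]), ('fr', a, t[2])]
            else:  # #susp: record pi^{-1}.a # X (simplified bookkeeping)
                theta.add((sw(list(reversed(t[1])), a), t[2]))
    return theta, delta
```

On the problem of Example 1 below, this sketch returns the unifier \((\{\texttt {A}\#X, \texttt {B}\#X, \texttt {C}\#X, \texttt {D}\#X\}, \varepsilon )\), and on Example 2 it fails.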

Lemma 8

Substitutions generated by the rule =var in Fig. 5 preserve \(=\) and \(\#\) obtained by applying rules in Fig. 4. That is,

  1. (a)

    If \(\theta ' \vdash \delta (\theta )\) and \(\theta \vdash M = N\) hold, then \(\theta ' \vdash \delta (M) = \delta (N)\) holds.

  2. (b)

    If \(\theta ' \vdash \delta (\theta )\) and \(\theta \vdash A \,\# \,M\) hold, then \(\theta ' \vdash A \, \# \, \delta (M)\) holds.

Proof

The proof of both is by structural induction. (a) We only show the case of abstraction. Assume \(M=\texttt {abs(A,X)}\), \(N=\texttt {abs(B,Y)}\), \(\delta = [\texttt {X}:=P_1, \texttt {Y}:=P_2]\). Then we have \(\theta = \{ \texttt {A} \# \texttt {Y}, \texttt {B} \# \texttt {X} \}\), \(\theta \subseteq \theta '\), \(\texttt {A}\mathord {\#}P_2 \), and \(\texttt {B}\mathord {\#}P_1\). From \(\theta \vdash M = N\), we have \(\texttt {X} = [\texttt {B} \leftrightarrow \texttt {A}]Y\). Using \(\texttt {A} \# P_2\) and \(\texttt {B} \# P_1\), and by the induction hypothesis, \(P_1 = [\texttt {B} \leftrightarrow \texttt {A}]P_2\) holds. Therefore, \(\theta ' \vdash \delta (\texttt {abs(A,X)}) = \delta (\texttt {abs(B,Y)})\) holds. (b) The proof is by structural induction.     \(\square \)

Terms in the hypergraph approach and the nominal approach are first-order terms without built-in \(\beta \)-reduction. To represent bound variables, the nominal approach uses concrete names and the hypergraph approach uses hyperlinks which are identified by names when writing hypergraph terms as text. Our unification and nominal unification both assume \(\alpha \)-equality for terms. Therefore, it is not surprising that our unification algorithm happens to be similar to the nominal unification algorithm. Nevertheless, there are differences. Our algorithm does not have a rule for handling two abstractions with the same bound variable. Also, the rule =rm is different from the \(\approx ?\)-suspension rule in nominal unification [14]. This is because Lemma 7 is different from its counterpart in nominal unification: the former states the freshness of every variable of \(\pi @ \pi '\) and the latter states the freshness of the variables in the disagreement set of \(\pi \) and \(\pi '\).

Theorem 2

For a given unification problem P, the unification algorithm in Fig. 5 either fails if P has no unifier or successfully produces an idempotent most general unifier.

Proof

Given in the Appendix together with related lemmas. The structure of the proof in [14] largely carries over to our case, though our formalization allows the = and # rules of the algorithm to be interleaved.     \(\square \)

4 Examples of the Unification

We apply the unification algorithm in Fig. 5 to three unification problems.

Example 1

A unification problem

$$ \texttt {abs(A,abs(B,}X\texttt {))}=\texttt {abs(C,abs(D,}X\texttt {))} $$

has a solution.


The problem has the most general unifier (\(\{\texttt {A}\#X, \texttt {C}\#X, \texttt {B}\#X, \texttt {D}\#X\}\), \(\varepsilon \)), which says that X can be any term not containing A, B, C or D.
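In Example 1, decomposing the two abstractions leads to the suspension \(X = (\texttt{B}\,\texttt{D})(\texttt{A}\,\texttt{C})\cdot X\); as discussed earlier, the rule =rm then requires every name occurring in the swapping list to be fresh for X. A minimal Python sketch of this step (the list-of-pairs representation of permutations is a hypothetical encoding for illustration):

```python
def solve_suspension(pi, x):
    """Sketch of the =rm rule: an equation  x = pi . x  is solvable exactly
    when every name occurring in the swapping list pi is fresh for x, so it
    reduces to that set of freshness constraints."""
    return {(a, x) for ab in pi for a in ab}

# Example 1: abs(A,abs(B,X)) = abs(C,abs(D,X)) reduces to X = (B D)(A C).X,
# yielding the constraints A#X, B#X, C#X and D#X:
print(sorted(solve_suspension([('B', 'D'), ('A', 'C')], 'X')))
# -> [('A', 'X'), ('B', 'X'), ('C', 'X'), ('D', 'X')]
```

This matches the most general unifier above: the empty substitution \(\varepsilon \) together with the freshness environment \(\{\texttt {A}\#X, \texttt {B}\#X, \texttt {C}\#X, \texttt {D}\#X\}\).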

Example 2

A unification problem

$$ \texttt {abs(A,abs(B,app(}X\texttt {,B)))}=\texttt {abs(C,abs(D,app(D,}X\texttt {)))} $$

has no solution.


The problem is unsolvable; the derivation yields both the equation \(\texttt {B} = \texttt {D}\) between distinct names and the unsatisfiable freshness constraint \(\texttt {B}\,\#\,\texttt {B}\), each of which causes failure.
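The failing freshness check can be sketched as follows: check_fresh decides \(a \,\#\, t\) for ground terms under a hypothetical tuple encoding ('name', a) | ('abs', a, body) | ('app', f, e), and its base case makes a constraint such as B # B fail immediately.

```python
def check_fresh(a, t):
    """Decide a freshness constraint  a # t  on a ground term (sketch).
    It fails exactly when a occurs free in t; a # a is the base failure."""
    kind = t[0]
    if kind == 'name':
        return t[1] != a                 # B # B is unsatisfiable
    if kind == 'abs':
        # A binder for a shadows it, so a is fresh for abs(a, body).
        return t[1] == a or check_fresh(a, t[2])
    if kind == 'app':
        return check_fresh(a, t[1]) and check_fresh(a, t[2])
    raise ValueError('unknown term: %r' % (t,))

# The constraint B # B produced in Example 2 fails:
print(check_fresh('B', ('name', 'B')))                  # -> False
# ...whereas B is fresh for abs(B, B), since that occurrence is bound:
print(check_fresh('B', ('abs', 'B', ('name', 'B'))))    # -> True
```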

Example 3

A unification problem

has no solution.


The problem is unsolvable; it fails due to the unsatisfiable freshness constraint \(\texttt {A} \# \texttt {A}\).

5 Implementation

We implemented the unification of hypergraph \(\lambda \)-terms in HyperLMNtal in a straightforward manner (Footnote 1). The implementation consists of 52 rewrite rules in total: 12 rewrite rules corresponding to the 9 rules in Fig. 5 (of which 4 implement the =var rule), 14 rules for the occur check, 7 rules for applying swappings to terms, 7 rules for substitution, and several auxiliary rules for list management. Interestingly, the implementation of the substitution \(M[X := N]\) turned out to be essentially the same as that for the \(\lambda \)-calculus, i.e., the one in Fig. 2. The implementation solved a number of unification problems, including the examples in this paper. HyperLMNtal brought simplicity in the sense that the rewrite rules of the implementation are extremely close to the unification rules discussed in this paper.
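As a toy analogue of applying a substitution \(M[X := N]\) for an unknown variable, the following Python sketch uses a hypothetical tuple encoding; it is an illustration only, not the HyperLMNtal rules, which operate on hypergraphs and use hlground for copying. Note that, unlike \(\beta \)-reduction, no renaming is performed at binders, so names in N may be captured, which the unification (like nominal unification) permits.

```python
def subst(t, x, n):
    """First-order substitution t[x := n] for an unknown variable x (sketch).
    Encoding: ('name', a) | ('abs', a, body) | ('app', f, e) | ('var', x).
    No binder is renamed, so names occurring in n may be captured."""
    kind = t[0]
    if kind == 'name':
        return t
    if kind == 'abs':
        return ('abs', t[1], subst(t[2], x, n))   # no alpha-renaming here
    if kind == 'app':
        return ('app', subst(t[1], x, n), subst(t[2], x, n))
    if kind == 'var':
        return n if t[1] == x else t
    raise ValueError('unknown term: %r' % (t,))

# Substituting X := A under abs(A, X) captures A, as the theory permits:
print(subst(('abs', 'A', ('var', 'X')), 'X', ('name', 'A')))
# -> ('abs', 'A', ('name', 'A'))
```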

6 Related Work and Conclusion

The complexity of formalizing unification over terms containing name binding is largely determined by the approach taken to represent such terms. There are two prominent unification algorithms: higher-order pattern unification [8] and nominal unification [14].

A higher-order approach implements a variant of the \(\lambda \)-calculus as a meta-language, which is used to encode formal systems involving name binding [9]. The meta-language implicitly handles substitution and implicitly keeps bound variables distinct. Users reason about formal systems indirectly through the meta-language, in which terms are higher-order terms. Higher-order pattern unification solves equations of terms modulo \(=_{\alpha \beta _0 \eta }\). It finds functions to substitute for unknown variables, which means that variable capture never happens. These characteristics of higher-order pattern unification result from letting the meta-language handle everything implicitly. In the nominal approach, bindable names are equipped with swapping and freshness to ensure correct substitutions [4]. Users reason about formal systems through nominal terms, which are first-order terms. As a result, nominal unification solves equations of terms modulo \(=_{\alpha }\), because \(=_{\beta \eta }\) is not needed for first-order terms, and allows variable capture in the unification while preserving \(\alpha \)-equivalence. We believe that the absence of restrictions on bound variables is the cause of the somewhat complex proofs in nominal unification. One observation is that using a higher-order meta-language implicitly ensures the distinctness of bound variables in the higher-order approach, whereas no such restriction on bound variables exists in the nominal approach.

Our approach uses hyperlinks to represent variables, hypergraphs to represent terms, and hlground followed by hypergraph copying to avoid variable capture. Unlike the nominal approach, we use fresh hyperlinks whenever needed, and hlground manages hyperlinks. In our approach, it is natural to restrict a hyperlink to be bound only once, so that every abstraction is syntactically unique. Just like nominal unification, our unification considers only \(\alpha \)-equivalence and allows variable capture in the unification. The key idea of our technique is that implementing \(\alpha \)-renaming (as the copying of hypergraphs identified by hlground) leads to the simplification of the overall reasoning. Urban pointed out that the proofs of nominal unification in [14] are clunky and presented simpler proofs in [15]. The proofs in this paper are even somewhat simpler than those in [15]. In our unification algorithm, the basic properties are easy to establish; Lemmas 4, 5, 6 and 7 are intuitive and simple. In particular, we proved the equivalence relation (Theorem 1) without much effort.

To conclude, we worked on the unification of hypergraph \(\lambda \)-terms, and the results suggest that our approach takes a promising strategy, as indicated by the simple proofs of the fundamental properties needed for the unification algorithm. We successfully implemented the unification algorithm in HyperLMNtal. This work suggests that our hypergraph rewriting framework provides a convenient platform for working with formal systems involving name binding and the unification of their terms. In the future, we plan to use this unification algorithm to encode type inference for formal systems involving name binding. It should also be interesting to reformalize logic programming languages such as \(\alpha \)Prolog [3] in our hypergraph-based approach and implement them in HyperLMNtal to see how much simplicity our approach can provide in practice.