Functions-as-constructors Higher-order Unification

Unification is a central operation in the construction of a range of computational logic systems based on first-order and higher-order logics. First-order unification has a number of properties that dominate the way it is incorporated within such systems. In particular, first-order unification is decidable, unary, and can be performed on untyped term structures. None of these three properties hold for full higher-order unification: unification is undecidable, unifiers can be incomparable, and term-level typing can dominate the search for unifiers. The so-called pattern subset of higher-order unification was designed to be a small extension to first-order unification that respected the basic laws governing λ-binding (the equalities of α, β, and η-conversion) but which also satisfied those three properties. While the pattern fragment of higher-order unification has been popular in various implemented systems and in various theoretical considerations, it is too weak for a number of applications. In this paper, we define an extension of pattern unification that is motivated by some existing applications and which satisfies these three properties. The main idea behind this extension is that the arguments to a higher-order, free variable can be more than just distinct bound variables: they can also be terms constructed from (sufficient numbers of) such variables using term constructors and where no argument is a subterm of any other argument. We show that this extension to pattern unification satisfies the three properties mentioned above.


Introduction
Unification is the process of solving equality constraints by the computation of substitutions. This process is used in computational logic systems ranging from automated theorem provers and proof assistants to type inference systems and logic programming. First-order unification (that is, unification restricted to first-order terms) enjoys at least three important computational properties, namely, (1) decidability, (2) determinacy, and (3) type-freeness. These properties of unification shaped the way it can be used within computational logic systems. The first two of these properties ensure that unification, as a process, will either fail to find a unifier for a given set of disagreement pairs or will succeed and return the most general unifier that solves all those disagreement pairs. The notion of type-freeness simply means that unification can be done independently of the possible typing discipline that might be employed with terms. Thus, first-order unification can be performed on untyped first-order terms (as terms are usually considered in, say, Prolog). This property is important since it means that unification can be used with any typing discipline that might be adopted. Since typing is usually an open-ended design issue in many languages (consider, for example, higher-order types, subtypes, dependent types, parametric types, linear types, etc.), the type-freeness of unification makes it possible for it to be applied to a range of typing disciplines.
Of course, many syntactic objects are not most naturally considered as purely first-order terms: this is the case when that syntax contains bindings. Instead, many systems have adopted the approach used by Church in his Simple Theory of Types [11] where terms and term equality come directly from the λ-calculus. All binding operations (quantification in first-order formulas, function arguments in functional programs, local variables, etc.) can be represented using the sole binder of the λ-calculus. Early papers showed that second-order pattern matching could be used to support interesting program analysis and program transformation [18] and that a higher-order version of Prolog could be used to do more general manipulations of programs and formulas [21]. Today, there is a rich collection of computational logic systems that have moved beyond first-order term unification and rely on some form of higher-order unification. These include the theorem provers TPS [3], Leo [7] and Satallax [10]; the proof assistants Isabelle [26], Coq [36], Matita [4], Minlog [31], Agda [9], Abella [5], and Beluga [29]; the logic programming languages λProlog [23] and Twelf [28]; and various application domains such as natural language processing [12].
The integration of full higher-order unification into computational logic systems is not as simple as it is in first-order systems since the three properties mentioned above do not hold. The unification of simply typed λ-terms is undecidable [15,16] and there can be incomparable unifiers, implying that no most general unifiers exist in the general situation. Also, types matter a great deal in determining the search space of unifiers. For example, let i and j be primitive types, let a be a constant of type i, and let F and X be variables of type α → i and α, respectively, where α is a type variable. Consider the unification problem (F X) = a. If we set α to j, then there is an mgu for this problem, namely [F ↦ λw.a]. If we set α to i, then there are two incomparable solutions, namely [F ↦ λw.a] and [F ↦ λw.w, X ↦ a]. If we set α to i → i, then there is an infinite number of incomparable solutions: [F ↦ λf.a] and, for each natural number n, [F ↦ λf.f^n a, X ↦ λw.w]. If higher-order values for α are considered, the possibility of unifiers becomes dizzying.
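The α = i → i case can be checked mechanically. The following is a minimal sketch, not part of the original development: it models λ-terms as Python functions (an encoding assumed here purely for illustration, with the hypothetical helper names `church_unifier` and `solves`) and confirms that each candidate pair F ↦ λf.f^n a, X ↦ λw.w sends (F X) to a. It only checks that the candidates are unifiers; it says nothing about their incomparability.

```python
# Illustrative encoding (an assumption of this sketch): lambda-terms are
# modelled by Python functions and the constant a by the string "a".
a = "a"

def church_unifier(n):
    """Return the pair (F, X) with F = lambda f. f^n a and X = lambda w. w."""
    def F(f):
        result = a
        for _ in range(n):
            result = f(result)   # apply f to the accumulated term, n times
        return result
    X = lambda w: w              # the identity function on type i
    return F, X

def solves(F, X):
    """Check that the instance of (F X) evaluates to the constant a."""
    return F(X) == a

# The constant solution F = lambda f. a works for any X.
constant_F = lambda f: a
print(solves(constant_F, lambda w: w))
# Every member of the infinite family is a solution as well.
print(all(solves(*church_unifier(n)) for n in range(5)))
```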
For these reasons, the integration of unification for simply typed λ-terms into computational logic systems is complex: most such integration efforts attempt to accommodate the (pre-)unification search procedure of Huet [17].
Instead of moving from first-order unification to full higher-order unification, it is possible to move to an intermediate space of unification problems. Given that higher-order unification is undecidable, there is an infinite number of decidable classes that one could consider. The setting of higher-order pattern unification (proposed in [20] and called L_λ-unification there) could be seen as the weakest extension of first-order unification in which the equations of α, β, and η-conversion hold. In this fragment, a free variable cannot be applied to general terms but only to bound variables that cannot appear free in eventual instantiations for the free variable. This restriction means that all forms of β-reduction encountered during unification are actually instances (up to α-equivalence) of the rule (λx.B)x = B (a conversion rule called β_0). Notice that in this setting, β-reduction reduces the size of terms: hence, unification takes on a much simpler nature. The unification problems that result retain all three properties we listed for first-order unification [20]. As a result, the integration of pattern unification into a prover is usually much simpler than incorporating all the search behavior implied by Huet's (pre-)unification procedure.
A somewhat surprising fact about pattern unification is that many computational logic systems actually need only this subset of higher-order unification in order to be "practically complete": that is, restricting unification to just this subset did not stop the bulk of specifications from being properly executed. For example, while both early implementations of λProlog and LF [24,28] implemented full higher-order unification, the most recent versions of those languages implement only pattern unification [1,30]. A design feature of both of those systems is to treat any unification problem that is not in the pattern fragment as a suspended constraint: usually, subsequent substitutions will cause such delayed problems to convert into the pattern fragment. Processing of constraints may also be possible: the application of pruning to flexible-flexible constraints in [22] is such an example. Also, since pattern unification does not require typing information, it has been possible to describe variants of such unification in settings where types can play a role during unification: see, for example, generalizations of pattern unification for dependent and polymorphic types [27], product types [13,14], and sum types [2].
Since pattern unification is a weak fragment of higher-order unification, it is natural to ask if it can be extended and still keep the same high-level properties. There have been extensions of pattern unification considered in the literature. The generalization (mentioned above) of pattern unification to product types by Fettig and Löchner [14] and Duggan [13] allows for constructors denoting projections to be admitted in the scope of free functional variables. These projections are specific unary functions which are closed under a number of properties, such as associativity. When attempting to encode the meta-theory of sequent calculus in which eigenvariables are seen as abstractions over sequents [19], a single bound variable was intended to be used as a list of bound (eigen)variables. Thus, in order to encode the sequent judgment x_0, ..., x_n ⊢ C x_0 ... x_n (for n ≥ 0 and all variables being of the same primitive type) one would instead use the simply typed term λl. C (fst l) ... (fst (snd^n l)), where the environment abstraction l has type, say, evs, and fst and snd are constructors of type evs → i and evs → evs, respectively. Tiu showed how to lift pattern unification to this setting [33]. The Coq proof assistant allows for some limited forms of unification and many simple unification problems can appear that should be automatically solved. A typical such example is of the form λx.Y (g x) ≐ λx.f (g x), where Y is a free variable of type i → i and f and g are constructors of the type i → i. Clearly, this problem has the mgu Y ↦ λz.f z but it falls outside the pattern restriction. There are certain uses of Coq (for example, with the bigop library of SSReflect) which produce a number of non-pattern unification problems.
Let us return to the definition of pattern unification problems. The restriction on occurrences of the free variable, say, M is that (1) it can be applied only to variables that cannot appear free in terms that are used to instantiate M and (2) that those arguments are distinct. Condition (1) essentially says that the arguments of M form a primitive pattern that allows one to form an abstraction to solve a unification problem. Thus, M x y can equal, say, (s x) + y simply by forming the abstraction λxλy.(s x) + y. Condition (2) implies that such abstractions are unique.
The examples above of richer unification problems illustrate that it is also natural to consider arguments built using variables and term constructors: that is, we should consider generalizing condition (1) above by allowing the application λl.(M (fst l) (fst (snd l))). If this application is required to unify with a term of the form λl.t then all occurrences of l in t must occur in subterms of the form (fst l) or (fst (snd l)). In that case, forming an abstraction of t by replacing all occurrences of (fst l) and (fst (snd l)) with separate bound variables gives a solution to this unification problem. To guarantee uniqueness of such solutions, we shall also generalize condition (2) so that the arguments of M cannot be subterms of each other. This additional constraint is required here (but not in the papers by Duggan [13] and Tiu [33]) since we wish to handle richer signatures than just those with monadic constraints.
Many of the examples leading to this generalization of pattern unification arise in situations where operators (such as fst and snd) are really functions and not constructors: the intended meanings of those two operators are as functions that map lists to either their first element or to their tail. When they arise in unification problems, however, we can only expect to treat them as constructors. Thus, we shall name this extended pattern unification function-as-constructor (pattern) unification, or just FCU for short.
The rest of this paper is structured as follows. We cover the basic concepts related to higher-order unification in Section 2. The class of unification problems addressed in this paper, the functions-as-constructor class, is defined in Section 3 as is a unification algorithm for that class. We prove the correctness of that algorithm in Section 4. We conclude in Section 5.

The Lambda-Calculus
In this section we will present the logical language that will be used throughout the paper.
The language is a version of Church's simple theory of types [11] with an η-conversion rule as presented in [6] and [32] and with implicit α-conversions. Unless stated otherwise, all terms are implicitly converted into β-normal and η-expanded form. Most of the definitions in this section are adapted from [32]. Let T_0 be a set of basic types; then the set of types T is generated by T := T_0 | T → T. Let C be a signature of function symbols and let V be a countably infinite set of variable symbols. Variables are normally denoted by the letters l, x, y, w, z, X, Y, W, Z and function symbols by the letters f, g, h, k, a or typed names like cons. We sometimes use subscripts and superscripts as well. We sometimes add a superscript to symbols in order to specify their type. The set Term^α of terms of type α is generated, as usual, from function symbols and variables by λ-abstraction and application. Applications throughout the paper will be associated to the left. We will sometimes omit brackets when the meaning is clear. We will also normally omit typing information when it is not crucial for the correctness of the results. τ(t^α) = α refers to the type of a term. The set Term denotes the set of all terms. Subterms and positions are defined as usual. We denote the fact that t is a subterm (respectively, a strict subterm) of s by t ⊑ s (respectively, t ⊏ s). Sizes of positions denote the length of the path to the position. We denote the subterm of t at position p by t|_p. Bound and free variables are defined as usual. We will use the convention of denoting bound and universally quantified variables by lowercase letters while existentially quantified variables will be denoted by capital letters. Given a term t, we denote by hd(t) its head symbol and distinguish between flex terms, whose head is a free variable, and rigid terms, whose head is a function symbol or a bound variable.
We will use both set union (∪) and disjoint set union (⊎) in the text. Substitutions and their composition (•) are defined as usual. Namely, (σ • θ)X = θ(σX). id denotes the trivial substitution mapping each variable to itself. We denote by σ|_W the substitution obtained from substitution σ by restricting its domain to variables in W. We denote by σ[X ↦ t] the substitution obtained from σ by mapping X to t, where X might already exist in the domain of σ. We extend the application of substitutions to terms in the usual way and denote it by postfix notation. Variable capture is avoided by implicitly renaming variables to fresh names upon binding. A substitution σ is more general than a substitution θ, denoted σ ≤ θ, if there is a substitution δ such that σ • δ = θ. The domain of a substitution σ is denoted by dom(σ).
We also introduce a vector notation $\overline{t_n}$ for the sequence of terms t_1, ..., t_n. This notation also holds for nesting of sequences. For example, the term f (X_1 z_1 z_2) (X_2 z_1 z_2) (X_3 z_1 z_2) will be denoted by $f\,\overline{X_3\,\overline{z_2}}$. The meaning of the notation $\lambda\overline{z_n}$ is λz_1. ... λz_n. When the order of the sequence is not important, we will use this notation also for multisets.

Higher-order Pre-unification
In this section we present Huet's pre-unification procedure [17] as defined in [32]. The procedure will be proven, in Section 3.2, to be deterministic for the class of FCU problems. This result, together with the completeness of the procedure, implies the existence of most-general unifiers for unifiable problems of this class.
The presentation in this paper of both the pattern and FCU unification algorithms is much simplified if the following non-standard normal form is used. All terms, including functional existential variables but excluding the arguments of these variables, are considered to be in η-expanded form. The arguments of these variables are expected to be in η-normal form. In a manner similar to the one in [32], one can prove that all substitutions used in this paper preserve this normal form.
Definition 2 (Unification System). A unification system over a signature C is a quadruple ⟨Q∃, Q∀, S, σ⟩ where Q∃ and Q∀ are disjoint sets of variables, S is a set of equations and σ is a substitution. Given a unification problem $\exists\overline{X_m}.\, e_1 \wedge \ldots \wedge e_n$ we consider the corresponding unification system over signature C.
An important property of terms in which we will be interested later is the subterm property between terms of different equations. Such a property is not closed under α-renaming of bound variables for our current definition of unification systems. Since we implicitly assume such renaming in order to avoid variable capture, we have to add an additional requirement (regularity). The next lemma is easily proven.

Lemma 5. The subterm property is closed under α-renaming of bound variables for regular unification systems.
Regular unification systems will also be called systems. Before presenting Huet's procedure for pre-unification, we will repeat the definition of partial bindings as given in [32].
Partial bindings fall into two categories: imitation bindings, which for a given atom a and type α are denoted by PB(a, α), and projection bindings, which for a given index 0 < i ≤ n and a type α are denoted by PB(i, α) and in which the atom a is equal to the bound variable y_i. Since partial bindings are uniquely determined by an index, a type and an atom (up to renaming of the fresh variables $\overline{X_m}$), this defines a particular term.
Definition 7 (Huet's Pre-unification Procedure). Huet's pre-unification procedure is given in Table 1. Note that the sets Q∃ and Q∀ are fixed during the execution and are mentioned explicitly just for compatibility with the algorithms given later in the paper.
The next theorem states the completeness of this procedure.
Theorem 8 ([32]). Given a system ⟨Q∃, Q∀, S, id⟩ and assuming it is unifiable by σ, there is a sequence of rule applications in Def. 7 resulting in ⟨Q∃, Q∀, ∅, θ⟩ such that θ ≤ σ.

Pattern Unification
In this section we describe the higher-order pattern unification algorithm of [20]. The notation used is similar to the one in [25]. This algorithm forms the basis for our algorithm.
Definition 9 (Pattern Systems). A system ⟨Q∃, Q∀, S, σ⟩ is called a pattern system if for all equations e_i ∈ S and for all subterms $X\overline{z_n}$ in these equations such that X ∈ Q∃ we have that $\overline{z_n}$ ⊆ Q∀ ∪ bvars(e_i) and z_i ≠ z_j for all 0 < i < j ≤ n.
The following simplification will be called during the execution of the algorithm given in Def. 11.

Definition 10 (Pruning). Given a pattern system
Definition 11 (Pattern Unification Algorithm). The pattern unification algorithm is the application of the rules from Table 2 such that before the application of rules (3) and (5) we exhaustively apply pruning.
Paper [20] contains a proof that the algorithm from Def. 11 is terminating, sound and complete.
Table 2: Pattern Unification Algorithm

A Unification Algorithm for FC Higher-order Unification Problems

FC Higher-order Unification (FCU) Problems
The main difference between pattern and FCU problems is in the form of arguments of existentially quantified variables. While in pattern unification problems these arguments must be a list of distinct universally quantified variables which occur in the scope of the existentially quantified one, we relax this requirement for FCU problems. This relaxation still ensures the existence of mgus if the problems are unifiable.
Definition 12 (Restricted Terms). Given C, Q∀ and an equation e, a restricted term in e is defined inductively as follows: (1) a is a restricted term if a ∈ Q∀ ∪ bvars(e); (2) $f\,\overline{t_n}$ is a restricted term if n > 0, f ∈ C and t_i is a restricted term for all 0 < i ≤ n. When e is clear from the context, we will refer to these terms just as restricted terms.
We will use examples over C = {cons, fst, snd, nil} and Q∀ = {l, z} to explain the definition and algorithms presented in the paper.
Example 13. The terms l, (cons z l), (cons (fst l) l) and (snd (cons z l)) are restricted terms over the above C and Q∀, while nil and (cons z nil) are not.

Definition 14 (FCU Systems). A system ⟨Q∃, Q∀, S, σ⟩ is an FCU system if the following three conditions are satisfied:
argument restriction: for all occurrences $X\overline{t_n}$ in S where X ∈ Q∃, t_i is a restricted term for all 0 < i ≤ n.
local restriction: for all occurrences $X\overline{t_n}$ in S where X ∈ Q∃ and for each t_i and t_j such that 0 < i, j ≤ n and i ≠ j, t_i is not a subterm of t_j.
global restriction: for each two different occurrences $X\overline{t_n}$ and $Y\overline{s_m}$ in S where X, Y ∈ Q∃ and for each 0 < i ≤ n and 0 < j ≤ m, t_i is not a strict subterm of s_j.
It is important to note that when dealing with restricted unification problems, the global restriction from above cannot be violated by occurrences within different equations. The name global only refers, therefore, to the context of one equation.
The next proposition is easy to verify.
Proposition 16. Pattern systems are FCU systems.
Before going on to show the properties of these problems, we would like to present a short discussion about the motivation behind the restrictions above. The three restrictions are required in order to maintain uniqueness of the result and will be used in the next section in order to prove the determinacy of Huet's procedure over FCU problems. Nevertheless, we do not prove that this result does not hold when weakening the above restrictions. The local restriction and global restriction can easily be shown to be required even for very simple examples. This is not the case for the argument restriction. One alternative is to weaken the restricted term definition from above to require only one subterm in the second condition to be restricted, that is, to allow terms such as (cons z nil) as arguments of existential variables. In the following, we will give a counter-example to this weaker restriction. Still, it should be noted that the counter-example depends on allowing inductive definitions containing more than one base case (in particular, we allow for different empty list constructors nil_1 and nil_2). When such definitions are not allowed, it may be possible to prove the results given in this paper for a stronger class of problems.

The Existence of Most-general Unifiers
From this section on, an FCU problem will be referred to simply as a system, unless indicated otherwise.
In [32] it is claimed that the only "don't-know" non-determinism in the general higher-order procedure stems from the choice between the different applications of (Imitate) and (Project). We prove that fulfilling the three restrictions in Def. 14 makes these choices deterministic.
We first prove a couple of auxiliary lemmas.
Lemma 18. Let t be a restricted term, s a term containing the subterm $X\overline{r_n}$ and σ a substitution such that t = sσ. Then there is a restricted term t′ such that $t' = X\overline{r_n}\sigma$.
Proof. Let k be the length of the position of $X\overline{r_n}$ in s. We proceed by induction on k. If k = 0, then t′ = t. If k > 0, then $s = f\,\overline{s_m}$ and $f\,\overline{s_m}\sigma = f\,\overline{t_m} = t$. Since t is restricted, by definition so are t_1, ..., t_m. Assume $X\overline{r_n}$ occurs in s_i; then, according to the inductive hypothesis, there is a restricted term t′ such that $t' = X\overline{r_n}\sigma$.
Lemma 19. Given a unifiable equation $X\overline{t_n} \doteq r$, where r is a restricted term, there is 0 < i ≤ n such that t_i is a subterm of r.
Proof. Assume the contrary and let σ be the unifier. Then $X\overline{t_n}\sigma = r$. By definition, r contains a symbol a ∈ Q∀ and we get a contradiction.

Lemma 20. Let t = t t k and s
Proof. Since t is restricted, it does not contain abstractions and variables and, as t is unifiable with s, it can be written as $f\,\overline{v_{n-k}}$. Since t is restricted, all its subterms are restricted as well.
The next two lemmas prove the determinism claim on applications of (Project) and (Imitate).

Lemma 21. Given the equation $X\overline{t_n} \doteq f\,\overline{s_m}$ where X does not occur in $f\,\overline{s_m}$ and assuming we can obtain two equations, equation 1 and equation 2, by applying two substitutions to it, there are no substitutions σ and θ such that σ unifies equation 1 and θ unifies equation 2.
Proof.Assume the existence of the two unifiers and obtain a contradiction.According to Lemma 20, we can rewrite the two equations as and for restricted terms v m−l and u m−k .Assume, wlog, that l ≥ k.Note also, that since t i = t j and t i , t j have f as head symbol, m ≥ m − k > 0. We consider two cases: s 1 , . . ., s m−k are all ground terms.In this case and since both equations are unifiable, we get the equation Clearly, k = l since otherwise t i = t j which violates the local restriction from Def. 14.We can now conclude that Since u m−l+1 is a restricted term then, according to Lemma 19, there is 0 < k 1 ≤ n such that t k1 is a subterm of u m−l+1 .Since u m−l+1 is a subterm of t j , we get that t k1 is a subterm of t j which contradicts the local restriction.
There is $s_{k_1}$ for $0 < k_1 \le m - k$ which contains an occurrence of $Z\overline{r_{k_2}}$. This must occur as a subterm of $s_{k_1}$, as otherwise the subterm $Z\overline{r_{k_2}}\,r'$, where r′ is not a restricted term, violates the argument restriction. Since $s_{k_1}\theta = u_{k_1}$ and $u_{k_1}$ is a restricted term, we have, according to Lemma 18, that there exists a restricted term u′ such that $Z\overline{r_{k_2}}\theta = u'$. Using Lemma 19, we can conclude that there is $0 < k_3 \le k_2$ such that $r_{k_3}$ is a subterm of u′, which is a subterm of $u_{k_1}$, which is a strict subterm of t_j, which violates the global restriction.
Lemma 22. Given the equation $X\overline{t_n} \doteq f\,\overline{s_m}$ where X does not occur in $f\,\overline{s_m}$ and assuming we can obtain two equations, equation 7 and equation 8, by applying two substitutions to it, there are no substitutions σ and θ such that σ unifies equation 7 and θ unifies equation 8.
Proof.Assume the existence of the two unifiers and obtain a contradiction.Using Lemma 20, we can rewrite Eq. 8 as: where v 1 , . . ., v m−k are restricted terms and strict subterms of t j .Since f is imitated, it is not a restricted term and f = t j which implies that m − k > 0. Eq. 9 tells us that v 1 = s 1 θ which implies that s 1 θ is a strict subterm of t j and a restricted term.On the other hand, we have that X 1 t n = s 1 σ from Eq. 7. We consider two cases: s 1 is ground.In this case we can use Lemma 19 and the fact that s 1 is a restricted term to conclude that there is 0 < k 1 ≤ n such that t k1 is a subterm of s 1 .On the other hand, we know that s 1 is a strict subterm of t j and therefore we get that t k1 is a strict subterm of t j , which contradicts the local restriction..If s 1 is not ground, it must contain an occurrence Zr l .This occurrence cannot occur as the subterm Zr l r where r is not a restricted term as it violates the argument restriction.Therefore, Zr l is a subterm of s 1 .Since s 1 θ = v 1 and since v 1 is a restricted term, we can use Lemma 18 to get that there is a restricted term v such that Zr l = v .Now we use Lemma 19 and get that there is 0 < k 1 ≤ l such that r k1 is a subterm of v , which is a strict subterm of t j .We get again a contradiction to the global restriction.
Theorem 23 (The existence of most-general unifiers). Given a unifiable FCU system S, applying the procedure in Def. 7 to S terminates and returns a most-general unifier for S.
Proof. The procedure in Def. 7 computes complete sets of unifiers and terminates with an element in this set [32]. Using Lemmas 21 and 22 we obtain that all transformations are deterministic. Therefore, the complete set contains only one element, which is the most-general unifier of S.

The Unification Algorithm
For defining the unification algorithm, we need to slightly extend the definition of pruning.

Definition 24 (Covers). A cover for $X\overline{t_n}$ and a restricted term q is a substitution σ such that $X\overline{t_n}\sigma = q$.
Note that uniqueness of covers follows from Theorem 23.
Example 25. The substitution [X ↦ λz_1 λz_2. cons (fst z_1) z_2] is a cover for (X l z) and (cons (fst l) z).
Definition 26 (Pruning). Given an FCU system ⟨Q∃, Q∀, S, σ⟩ such that $(X\overline{t_n} \doteq r) \in S$ and r contains an occurrence of a maximal restricted term q such that q ∉ $\overline{t_n}$: if there is a subterm $W\overline{s_m}$ of r such that q = s_i for some 0 < i ≤ m, then return the system in which the argument s_i of W has been eliminated; else if there is no cover ρ for $X\overline{t_n}$ and q, then return ⟨Q∃, Q∀, ⊥, id⟩; else, do nothing.
Example 27. Given the system ⟨{X, Y, W, Z}, {l, w, z}, {X (snd l) z ≐ Y z (fst l), W (fst l) z ≐ snd (Z w (fst l))}, id⟩, we can apply three prunings.
For the next definition, we will use the replacement operator $r|^{\overline{t_n}}_{\overline{z_n}}$ to denote the replacement of each occurrence of t_i in r with z_i for 0 < i ≤ n.
Definition 28 (Algorithm for FCU Systems). The rules of an algorithm for the unification of FCU systems are given in Table 3, where before the application of rules (3) and (5) we exhaustively apply pruning.
Table 3: An algorithm for FCU problems (side condition: X ≠ Y, $\theta = [Y \mapsto \lambda\overline{z_m}.\, X\overline{z_{\phi(m)}}]$ and φ is a permutation (see Lemma 36) such that φ(j) = i if t_i = s_j for 0 < i ≤ n and 0 < j ≤ m)
Example 29. The following problem is contained in one of the classes of problems discussed in the introduction: ∃X∃Y λl_1 λl_2. X (fst l_1) (fst (snd l_1)) ≐ λl_1 λl_2. snd (Y (fst l_2) (fst l_1)). Table 4 gives a full execution of the algorithm on it.
Table 4: An example of a reduction on an FCU problem
Notice that this algorithm can also work with terms that are essentially untyped: it is the presence or absence of constructors and bound variables that matters in this algorithm and not the types of those constructors and variables. Rich typing can, of course, be used to disallow unifiers that are created by considering terms to be type-less.

Correctness of the Algorithm
The unification algorithm transforms systems by the application of substitutions and by the elimination of equations. We prove next that the application of rules of the algorithm in Def. 28 on FCU problems results in FCU problems as well.
Lemma 30. Given an FCU problem, the application of rules from Def. 28 results in an FCU problem.
Proof. Removing equations from the system clearly preserves the restrictions of FCU problems. This result is also immediate when applying substitutions, as the only change to the arguments of the variables in the problem is to eliminate them and we have already claimed in Lemma 5 that the subterm property is closed under α-renaming for these problems.
The following lemma states that projected arguments of variables on one side of the equation must always match arguments on the other side.
Lemma 31. Let $X\overline{t_n} \doteq r$ be an equation such that r contains an occurrence of $Y\overline{s_m}$ where $r \neq Y\overline{s_m}$ and let σ be a unifier of this equation such that $\sigma Y = \lambda\overline{z_m}.\, s$. Then, for each occurrence z_i in s for 0 < i ≤ m, there is 0 < j ≤ n such that s_i = t_j.
Proof.We prove by induction on the number of occurrences.If s does not contain such occurrence, then the lemma clearly holds.Assume s contains an occurrence z i for 0 < i ≤ m and that there is no 0 < j ≤ n such that s i = t j .In case there is more than one such occurrence in s, choose this occurrence to be in a minimal such subterm, i.e. z i occurs in a subterm z i v k such that all occurrences of z ∈ z m in v k fulfill the requirement that there is t j = z for some 0 < j ≤ n.Let λz m .zi v k (s m ) = s i v k .Since r = Y s m and the argument restriction, we have that Y s m r.Since Xt n σ = rσ, we get that s i v k Xt n σ.Since s i is a restricted term, we get that there is 0 < j ≤ n such that either s i v k t j .By the minimality assumption, if v k contains a restricted term, then it must be equal to some t l 0 < l ≤ n and therefore, that t l t j , which contradicts the local restriction.Therefore, since t j is a restricted term, k = 0. We obtain that s i t j and since s i = t j by assumption, we get, again, a contradiction to the global restriction.t j s i .Again, since s i = t j , we get a contraction to the global restriction.
We now prove, for each rule in the FCU algorithm, a relative completeness result. We start with the non-unifiability of problems with a positive occur check.
= f s m }, σ be a system such that X occurs in s m , then the system is not unifiable.
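Proof. Assume it is unifiable by $\theta$ with $\theta X = \lambda \overline{z_n}.\, s$. We consider two cases. First, suppose $s$ does not contain any occurrence of a variable $z_i$ for $0 < i \le n$. Let $\#t$ be the number of occurrences of symbols from $C$ in $t$. Then $\#(X\,\overline{t_n}\theta) = \#(\theta X) \le \#(\overline{s_m}\theta) < \#(f\,\overline{s_m}\theta)$, which contradicts $X\,\overline{t_n}\theta = f\,\overline{s_m}\theta$.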
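The counting argument of this first case can be checked on a small instance. The sketch below is illustrative only: it assumes that $C$ is the set of constants, represents terms as nested tuples, and uses the hypothetical equation $X\,(\mathit{fst}\,l) \doteq \mathit{cons}\,(X\,(\mathit{snd}\,l))\,l$ together with candidate bodies $s$ that ignore the bound variable of $\theta X$.

```python
# Terms (representation assumed for this sketch):
#   "l"                 a bound variable (plain string)
#   ("cons", t1, t2)    a constant applied to arguments; ("c",) is a 0-ary constant

def count_constants(t):
    """#t: number of occurrences of constant symbols in t."""
    if isinstance(t, str):          # bound variables contribute nothing
        return 0
    return 1 + sum(count_constants(arg) for arg in t[1:])

# Candidate bodies s for theta X = \z. s, none of which mentions z.
candidates = [
    ("c",),
    ("g", ("c",)),
    ("h", ("c",), ("g", ("c",))),
]

for s in candidates:
    lhs = s                   # (X (fst l)) theta reduces to s
    rhs = ("cons", s, "l")    # (cons (X (snd l)) l) theta reduces to cons s l
    # The right-hand side always has strictly more constant occurrences,
    # so the two sides cannot be equal.
    assert count_constants(lhs) < count_constants(rhs)
```

Whatever body is chosen, the right-hand side contains one more constant occurrence than the left-hand side, which is the strict inequality used in the proof.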
Otherwise, $s$ contains such an occurrence; let $X\,\overline{q_n}$ be the occurrence in $\overline{s_m}$. According to Lemma 31, for every occurrence of $z_i$ in $s$ with $0 < i \le n$ there is an index $0 < j \le n$ such that $t_j = q_i$. Let $\rho$ be the mapping between indices defined in this way, so that $\rho(i) = j$. Let $\overline{r_k}$ be the set of indices $0 < i \le n$ which occur in $s$, for some $k \le n$.
Let $p$ be the non-trivial position of $X\,\overline{q_n}$ in $f\,\overline{s_m}$ and let $p'$ be the maximal position of a $z_i$ in $s$ for $i \in \overline{r_k}$. This means we have $q_i$ at position $p \cdot p'$ in $f\,\overline{s_m}\theta$ and since $t_{\rho(i)} = q_i$
Theorem 39 (Most-general unifier). Given a system $\langle Q_\exists, Q_\forall, S, \mathit{id}\rangle$, if the algorithm defined in Def. 28 terminates with system $\langle Q_\exists, Q_\forall, \emptyset, \sigma\rangle$, then $\sigma$ is an mgu of $S$.
Proof. Since the algorithm in Def. 28 is deterministic, we can use Theorem 38 to prove that $\sigma$ is an mgu.
The next theorem is proved by simulating the algorithm in Def. 28 using the procedure in Def. 7.
Theorem 40 (Soundness). Given a system $\langle Q_\exists, Q_\forall, S, \mathit{id}\rangle$ and assuming there is a sequence of rule applications in Def. 28 resulting in $\langle Q_\exists, Q_\forall, \emptyset, \theta\rangle$, then $\theta$ is a unifier of $S$.
Proof. Each of the rules (0), (1), (2) and (3) can clearly be simulated using the procedure in Def. 7, so the required result follows from Theorem 23. For rules (4), (5) and the first case of pruning, assume there is another substitution $\rho$ such that $\rho$ unifies the problem and $\rho < \theta$. This can only happen if $\rho X = \lambda \overline{z_n}.\, W\,\overline{r_k}$ such that $\overline{r_k} \subset \overline{r_k}$. Lemma 35 states that there is no unifier $\omega$ and substitution $\gamma$ such that $\omega = \rho \circ \gamma$. Since the second case of pruning results in failure, we are done.

Conclusion
We have described an extension of pattern unification called function-as-constructor unification. Such unification problems typically show up in situations where functions are applied to bound variables and where such functions are treated as term constructors (at least during the process of unification). We have shown that the properties that make first-order and pattern unification desirable for implementation, namely decidability and the existence of mgus for unifiable pairs, also hold for this class of unification problems. We are planning an implementation within the Leo-III theorem prover [35]. We then plan to compare this approach to unification with the implementation of Huet's pre-unification algorithm available in Leo-III when exercised against the THF set of problems within TPTP [8].
Another possible extension of this work is to improve its complexity. The current algorithm, like the one in [20] and the first-order unification algorithms used in practice, has exponential complexity. We would like to follow the work of Qian [34] and prove that FCU is of linear complexity.

Example 15. Some examples of the equations of FCU systems are $\{X\,l\,z \doteq \mathit{fst}\,(\mathit{snd}\,l)\}$ and $\{\mathit{cons}\,(X\,(\mathit{fst}\,l))\,(\mathit{snd}\,l) \doteq \mathit{snd}\,(Y\,(\mathit{fst}\,l)\,(\mathit{fst}\,(\mathit{snd}\,l)))\}$. Note that only the first example is a pattern unification problem. Examples of non-FCU problems are $\{X\,(\mathit{cons}\,z\,\mathit{nil}) \doteq \mathit{snd}\,l\}$, which violates the argument restriction, $\{X\,(\mathit{fst}\,l)\,l \doteq \mathit{cons}\,z\,l\}$, which violates the local restriction, and $\{X\,(\mathit{fst}\,l) \doteq \mathit{snd}\,(Y\,(\mathit{cons}\,(\mathit{fst}\,l)\,(\mathit{snd}\,l)))\}$, which violates (only) the global restriction.
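The classification in Example 15 can be reproduced with a small checker. The sketch below is not the paper's algorithm, and it rests on assumptions: terms are nested tuples, upper-case heads are free variables, and the three restrictions are read as follows. Argument: every argument of a free variable is a bound variable or a constant applied to at least one such term. Local: no argument of an occurrence is a subterm of another argument of the same occurrence. Global: no argument of one occurrence is a strict subterm of an argument of a different occurrence. All function names are hypothetical.

```python
# Terms (representation assumed for this sketch):
#   "l", "z"            bound variables (plain strings)
#   ("fst", t)          constant applied to arguments; ("nil",) is a 0-ary constant
#   ("X", t1, ..., tn)  free variable (upper-case head) applied to arguments

def is_free_var_app(t):
    return isinstance(t, tuple) and t[0][0].isupper()

def subterms(t):
    """Yield t itself and all of its subterms."""
    yield t
    if isinstance(t, tuple):
        for arg in t[1:]:
            yield from subterms(arg)

def is_restricted(t):
    """Bound variable, or a constant applied to >= 1 restricted terms."""
    if isinstance(t, str):
        return True
    if is_free_var_app(t) or len(t) == 1:
        return False
    return all(is_restricted(arg) for arg in t[1:])

def occurrences(equation):
    """Argument tuples of every free-variable occurrence on both sides."""
    return [t[1:] for side in equation
            for t in subterms(side) if is_free_var_app(t)]

def is_pattern(equation):
    """Pattern fragment: arguments are distinct bound variables."""
    return all(all(isinstance(a, str) for a in args)
               and len(set(args)) == len(args)
               for args in occurrences(equation))

def violations(equation):
    """Names of the restrictions (as assumed above) that the equation breaks."""
    occs = occurrences(equation)
    found = set()
    for i, args in enumerate(occs):
        if not all(is_restricted(a) for a in args):
            found.add("argument")
        for ai, a in enumerate(args):
            for bi, b in enumerate(args):
                if ai != bi and a in subterms(b):
                    found.add("local")
        for j, other in enumerate(occs):
            if i != j:
                for a in args:
                    for b in other:
                        if a != b and a in subterms(b):   # strict subterm
                            found.add("global")
    return found

fst_l, snd_l = ("fst", "l"), ("snd", "l")
examples = {
    "fcu_pattern": (("X", "l", "z"), ("fst", snd_l)),
    "fcu_nonpattern": (("cons", ("X", fst_l), snd_l),
                       ("snd", ("Y", fst_l, ("fst", snd_l)))),
    "bad_argument": (("X", ("cons", "z", ("nil",))), snd_l),
    "bad_local": (("X", fst_l, "l"), ("cons", "z", "l")),
    "bad_global": (("X", fst_l), ("snd", ("Y", ("cons", fst_l, snd_l)))),
}

for name, eq in examples.items():
    print(name, sorted(violations(eq)), "pattern" if is_pattern(eq) else "")
```

Under these readings the first two equations report no violation, only the first is a pattern problem, and each of the last three reports exactly the restriction named in the example.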
Definition 1 (Unification Problem). An equation is a formula $t \doteq s$ where $t$ and $s$ are βη-normalized (see remark above) terms. A unification problem is a formula of the form $\exists \overline{X_m}.\, e_1 \wedge \ldots \wedge e_n$ where $e_i$, for $0 < i \le n$, is an equation. Sets of equations are always closed under symmetry, i.e. if $t \doteq s$ is in the set, then so is $s \doteq t$.
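The symmetry convention of Definition 1 can be made concrete with a short sketch. It assumes, purely for illustration, that an equation is a pair of hashable terms; the function name is hypothetical.

```python
def close_under_symmetry(equations):
    """Return the set of equations extended with every flipped pair (s, t)."""
    eqs = set(equations)
    return eqs | {(s, t) for (t, s) in eqs}

# One equation X l z = fst (snd l), written as a pair of nested tuples.
problem = {(("X", "l", "z"), ("fst", ("snd", "l")))}
closed = close_under_symmetry(problem)
```

Applying the function twice gives the same set as applying it once, so an implementation may orient each equation freely without changing the problem.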
). A regular unification system is a unification system in which each bound variable that is bound in a different context has a different name.