Order-sorted equational generalization algorithm revisited

Abstract

Generalization, also called anti-unification, is the dual of unification. A generalizer of two terms t and \(t^{\prime }\) is a term \(t^{\prime \prime }\) of which t and \(t^{\prime }\) are substitution instances. The dual of most general equational unifiers is that of least general equational generalizers, i.e., most specific anti-instances modulo equations. In a previous work, we extended the classical untyped generalization algorithm to: (1) an order-sorted typed setting with sorts, subsorts, and subtype polymorphism; (2) work modulo equational theories, where function symbols can obey any combination of associativity, commutativity, and identity axioms (including the empty set of such axioms); and (3) the combination of both, which results in a modular, order-sorted equational generalization algorithm. However, Cerna and Kutsia showed that our algorithm is generally incomplete for the case of identity axioms and a counterexample was given. Furthermore, they proved that, in theories with two identity elements or more, generalization with identity axioms is generally nullary, yet it is finitary for both the linear and one-unital fragments, i.e., either solutions with repeated variables are disregarded or the considered theories are restricted to having just one function symbol with an identity or unit element. In this work, we show how we can easily extend our original inference system to cope with the non-linear fragment and identify a more general class than one–unit theories where generalization with identity axioms is finitary.

References

  1. Alpuente, M., Escobar, S., Meseguer, J., Ojeda, P.: A modular equational generalization algorithm. In: Hanus, M. (ed.) Logic-based program synthesis and transformation, 18th international symposium, LOPSTR 2008, Valencia, Spain, July 17-18, 2008, Revised Selected Papers, Springer, Lecture Notes in Computer Science, vol. 5438, pp 24–39 (2008), https://doi.org/10.1007/978-3-642-00515-2_3

  2. Alpuente, M., Escobar, S., Meseguer, J., Ojeda, P.: Order-sorted generalization. Electr Notes Theor Comput Sci 246, 27–38 (2009). https://doi.org/10.1016/j.entcs.2009.07.013

  3. Alpuente, M., Escobar, S., Espert, J., Meseguer, J.: ACUOS: A system for modular ACU generalization with subtyping and inheritance. In: Proceedings of the 14th European Conference on Logics in Artificial Intelligence (JELIA 2014), Springer-Verlag, Berlin, Lecture Notes in Computer Science, vol. 8761, pp 573–581 (2014)

  4. Alpuente, M., Escobar, S., Espert, J., Meseguer, J.: A modular order-sorted equational generalization algorithm. Inf. Comput. 235, 98–136 (2014). https://doi.org/10.1016/j.ic.2014.01.006

  5. Alpuente, M., Ballis, D., Cuenca-Ortega, A., Escobar, S., Meseguer, J.: ACUOS2: a high-performance system for modular ACU generalization with subtyping and inheritance. In: Proceedings of the 16th European Conference on Logics in Artificial Intelligence (JELIA 2019), Springer-Verlag, Berlin, Lecture Notes in Computer Science, vol. 11468, pp 171–181 (2019)

  6. Alpuente, M., Ballis, D., Sapiña, J.: Efficient safety enforcement for Maude programs via program specialization in the ÁTAME system. Math. Comput. Sci. 14(3), 591–606 (2020)

  7. Alpuente, M., Cuenca-Ortega, A., Escobar, S., Meseguer, J.: A partial evaluation framework for order-sorted equational programs modulo axioms. 110: 1–36 (2020)

  8. Armengol, E.: Usages of generalization in case-based reasoning. In: Proceedings of the 7th International Conference on Case-Based Reasoning (ICCBR 2007), Springer-Verlag, Lecture Notes in Computer Science, vol. 4626, pp 31–45 (2007). https://doi.org/10.1007/978-3-540-74141-1_3

  9. Baumgartner, A., Kutsia, T., Levy, J., Villaret, M.: Term-graph anti-unification. In: FSCD, Schloss Dagstuhl - Leibniz-Zentrum fuer Informatik, LIPIcs, vol. 108, pp 9:1–9:17 (2018)

  10. Cerna, D.M., Kutsia, T.: Unital anti-unification: Type and algorithms. In: Ariola, Z.M. (ed.) 5th International Conference on Formal Structures for Computation and Deduction, FSCD 2020, June 29-July 6, 2020, Paris, France (Virtual Conference), Schloss Dagstuhl - Leibniz-Zentrum für Informatik, LIPIcs, vol. 167, pp 26:1–26:20 (2020), https://doi.org/10.4230/LIPIcs.FSCD.2020.26

  11. Durán, F., Lucas, S., Meseguer, J.: Termination modulo combinations of equational theories. In: Ghilardi, S., Sebastiani, R. (eds.) FroCos, Springer, Lecture Notes in Computer Science, vol. 5749, pp 246–262 (2009)

  12. Escobar, S., Sasse, R., Meseguer, J.: Folding variant narrowing and optimal variant termination. J. Log. Algebr. Program 81(7-8), 898–928 (2012)

  13. Goguen, J., Meseguer, J.: Order-sorted algebra I: Equational deduction for multiple inheritance, overloading, exceptions and partial operations. Theor. Comput. Sci. 105, 217–273 (1992)

  14. Huet, G.: Résolution d’équations dans des langages d’ordre 1, 2,…,ω. PhD thesis, Univ. Paris VII (1976)

  15. Meseguer, J.: Membership algebra as a logical framework for equational specification. In: Parisi-Presicce, F (ed.) Proceedings of 12th International Workshop on Recent Trends in Algebraic Development Techniques, WADT’97, Springer, LNCS, vol. 1376, pp 18–61 (1997)

  16. Muggleton, S.: Inductive Logic Programming: Issues, Results and the Challenge of Learning Language in Logic. Artif. Intell. 114(1-2), 283–296 (1999)

  17. Ontañón, S., Plaza, E.: Similarity measures over refinement graphs. Mach. Learn. 87(1), 57–92 (2012). https://doi.org/10.1007/s10994-011-5274-3

  18. Plotkin, G.: A note on inductive generalization. In: Machine Intelligence, vol. 5, pp 153–163. Edinburgh University Press (1970)

  19. Reynolds, J.: Transformational systems and the algebraic structure of atomic formulas. Mach. Intell. 5, 135–151 (1970)

  20. TeReSe (ed.): Term Rewriting Systems. Cambridge University Press, Cambridge (2003)

Author information

Corresponding author

Correspondence to Santiago Escobar.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Appendix A: Proofs of technical results

A.1 Proof of Theorem 1

Theorem 1 (Termination)

Given a kind-completed, B-pre-regular, order-sorted equational theory (Σ,B) with a set B of ACU axioms, and a generalization problem \({\varGamma }= {t}\overset {x{:}\mathsf {[s]}}{\triangleq }{t^{\prime }}\), with \(\mathsf {[s]}=[\textit {LS}(t)]=[\textit {LS}(t^{\prime })]\), such that the subsignature \({\Sigma }_{\varGamma }\) is U-tolerant, every derivation stemming from an initial configuration \(\langle {\varGamma } \mid \emptyset \mid \textit {id} \rangle \) using the inference rules of Figs. 1, 2, 3, 4, 5 and 6 terminates in a final configuration of the form \(\langle \emptyset \mid S \mid \theta \rangle \).

Proof

Identical to the proof of [4] since it is not possible to have a generalization problem of the form \({e_{f} \overset {w}{\triangleq } e_{g}}\) in a U-tolerant signature. □

A.2 An Auxiliary least general generalization procedure

In order to prove correctness and completeness (Theorems 2 and 3, respectively, in Appendix A) of the order-sorted, equational least general generalization procedure, we follow the same proof scheme as [4] and define order-sorted B-lgg computation by subsort specialization. In other words, we mimic the computation of least general generalizers by first removing sorts (i.e., upgrading variables to top sorts), then computing (unsorted) B-lggs, and finally obtaining the right subsorts by a suitable specialization post-processing.

First, we recall the notion of a conflict pair.

Definition 3 (Conflict Position/Pair)

Given terms t and \(t^{\prime }\), a position \(p \in {\mathcal {P}}os(t)\cap {\mathcal {P}}os(t^{\prime })\) is called a conflict position of t and \(t^{\prime }\) if \(root(t|_{p})\neq root(t^{\prime }|_{p})\) and for all q < p, \(root(t|_{q}) = root(t^{\prime }|_{q})\). Given terms t and \(t^{\prime }\), the pair (u,v) is called a conflict pair of t and \(t^{\prime }\) if there exists at least one conflict position p of t and \(t^{\prime }\) such that \(u=t|_{p}\) and \(v=t^{\prime }|_{p}\).
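Definition 3 can be made concrete with a minimal Python sketch. The term encoding (strings for constants and variables, tuples for applications), the fixed-arity assumption, and all function names below are hypothetical choices made for illustration; they are not part of the paper or of any existing tool.

```python
from typing import Iterator, Tuple, Union

# Hypothetical encoding: a constant or variable is a string, and an
# application f(t1, ..., tn) is the tuple ("f", t1, ..., tn).
Term = Union[str, tuple]
Position = Tuple[int, ...]  # 1-based argument indices; () is the root


def root(t: Term) -> str:
    """Root symbol of a term."""
    return t if isinstance(t, str) else t[0]


def subterm(t: Term, p: Position) -> Term:
    """Subterm t|p; index i selects the i-th argument (slot 0 holds the symbol)."""
    for i in p:
        t = t[i]
    return t


def conflict_positions(t: Term, u: Term, p: Position = ()) -> Iterator[Position]:
    """Yield each position whose roots differ while all roots above it agree."""
    if root(t) != root(u):
        yield p
        return
    if isinstance(t, str):
        return
    # Assumes every symbol has a fixed arity, so equal roots have equal arity.
    for i, (a, b) in enumerate(zip(t[1:], u[1:]), start=1):
        yield from conflict_positions(a, b, p + (i,))


def conflict_pairs(t: Term, u: Term) -> set:
    """Set of pairs (t|p, u|p) over all conflict positions p."""
    return {(subterm(t, p), subterm(u, p)) for p in conflict_positions(t, u)}


t1 = ("f", ("g", "a"), "b", ("h", "c"))
t2 = ("f", ("g", "d"), "b", ("k", "c"))
print(list(conflict_positions(t1, t2)))  # [(1, 1), (3,)]
```

For f(g(a), b, h(c)) and f(g(d), b, k(c)), the conflict pairs are (a, d) and (h(c), k(c)); the shared argument b produces none.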

The following notions of pair of subterms and conflict pair are specialized to the case when function symbols obey C, A, AC, and U and are the basis for our overall proof scheme.

Definition 4 (Commutative Pair of Subterms)

Given terms t and \(t^{\prime }\) such that every symbol in t and \(t^{\prime }\) is either free or commutative, the pair (u,v) of terms is called a commutative pair of subterms of t and \(t^{\prime }\) if and only if there are positions \(p\in \mathit {Pos}(t)\) and \(p^{\prime }\in \mathit {Pos}(t^{\prime })\) such that:

  • \(t|_{p}=u\), \(t^{\prime }|_{p^{\prime }}=v\), \(\mathit {depth}(p)=\mathit {depth}(p^{\prime })\),

  • for each \(0\leq i < \mathit {depth}(p)\), \(root(t|_{p|_{i}}) = root(t^{\prime }|_{p^{\prime }|_{i}})\), and

  • for each \(0 < j \leq \mathit {depth}(p)\):

    • if \(root(t|_{p|_{j-1}})\) is free, then \((p)_{j} = (p^{\prime })_{j}\), and

    • if \(root(t|_{p|_{j-1}})\) is commutative, then \((p)_{j} = (p^{\prime })_{j}\) or \((p)_{j}= ((p^{\prime })_{j}\ \textit {mod}\ 2)+1\).

Definition 5 (Commutative Conflict Pair)

Given terms t and \(t^{\prime }\) such that every symbol in t and \(t^{\prime }\) is either free or commutative, the pair (u,v) is called a commutative conflict pair of t and \(t^{\prime }\) if and only if root(u)≠root(v) and (u,v) is a commutative pair of subterms of t and \(t^{\prime }\).

Definition 6 (Associative Pair of Positions)

Given flattened terms t and \(t^{\prime }\) such that every symbol in t and \(t^{\prime }\) is either free or associative, and given positions \(p\in \mathit {Pos}(t)\) and \(p^{\prime }\in \mathit {Pos}(t^{\prime })\), the pair \((p,p^{\prime })\) of positions is called an associative pair of positions of t and \(t^{\prime }\) if and only if

  • \(\mathit {depth}(p)=\mathit {depth}(p^{\prime })\),

  • for each \(0\leq i < \mathit {depth}(p)\), \(root(t|_{p|_{i}}) = root(t^{\prime }|_{p^{\prime }|_{i}})\), and

  • for each \(0 < j \leq \mathit {depth}(p)\):

    • if \(root(t|_{p|_{j-1}})\) is free, then \((p)_{j} = (p^{\prime })_{j}\), and

    • if \(root(t|_{p|_{j-1}})\) is associative, then no restriction on \((p)_{j}\) and \((p^{\prime })_{j}\).

Definition 7 (Associative Pair of Subterms)

Given flattened terms t and \(t^{\prime }\) such that every symbol in t and \(t^{\prime }\) is either free or associative, the pair (u,v) of terms is called an associative pair of subterms of t and \(t^{\prime }\) if and only if either

  1. (Regular subterms) for each pair of positions \(p\in \mathit {Pos}(t)\) and \(p^{\prime }\in \mathit {Pos}(t^{\prime })\) such that \(t|_{p}=u\) and \(t^{\prime }|_{p^{\prime }}=v\), \((p,p^{\prime })\) is an associative pair of positions of t and \(t^{\prime }\); or

  2. (Associative subterms) there are positions \(p\in \mathit {Pos}(t)\), \(p^{\prime }\in \mathit {Pos}(t^{\prime })\) such that the following conditions are satisfied:

    • \((p,p^{\prime })\) is an associative pair of positions of t and \(t^{\prime }\),

    • \(u=f(u_{1},\ldots ,u_{n_{u}})\), \(n_{u}\geq 1\), \(v=f(v_{1},\ldots ,v_{n_{v}})\), \(n_{v}\geq 1\), f is associative,

    • \(t|_{p}=f(t_{1},\ldots , t_{k_{1}}, u_{1},\ldots , u_{n_{u}},t_{k_{2}},\ldots , t_{n_{p}})\), \(n_{p}\geq 2\), \(t^{\prime }|_{p^{\prime }}=f(t^{\prime }_{1},\ldots , t^{\prime }_{k^{\prime }_{1}},v_{1},\ldots , v_{n_{v}},t^{\prime }_{k^{\prime }_{2}},\ldots , t^{\prime }_{n_{p^{\prime }}})\), \(n_{p^{\prime }}\geq 2\),

    • \(k_{1}=0\) (no arguments before \(u_{1}\)) if and only if \(k^{\prime }_{1} = 0\) (no arguments before \(v_{1}\)), and

    • \(k_{2} > n_{p}\) (no arguments after \(u_{n_{u}}\)) if and only if \(k^{\prime }_{2} > n_{p^{\prime }}\) (no arguments after \(v_{n_{v}}\)).

Definition 8 (Associative Conflict Pair)

Given flattened terms t and \(t^{\prime }\) such that every symbol in t and \(t^{\prime }\) is either free or associative, the pair (u,v) is called an associative conflict pair of t and \(t^{\prime }\) if and only if root(u)≠root(v) and (u,v) is an associative pair of subterms of t and \(t^{\prime }\).

Definition 9 (Associative-commutative Pair of Positions)

Given flattened terms t and \(t^{\prime }\) such that every symbol in t and \(t^{\prime }\) is either free or associative-commutative, and given positions \(p\in \mathit {Pos}(t)\) and \(p^{\prime }\in \mathit {Pos}(t^{\prime })\), the pair \((p,p^{\prime })\) of positions is called an associative-commutative pair of positions of t and \(t^{\prime }\) if it satisfies the conditions for being an associative pair of positions of t and \(t^{\prime }\).

Definition 10 (Associative-commutative Pair of Subterms)

Given flattened terms t and \(t^{\prime }\) such that every symbol in t and \(t^{\prime }\) is either free or associative-commutative, the pair (u,v) of terms is called an associative-commutative pair of subterms of t and \(t^{\prime }\) if and only if either

  1. (Regular subterms) for each pair of positions \(p\in \mathit {Pos}(t)\) and \(p^{\prime }\in \mathit {Pos}(t^{\prime })\) such that \(t|_{p}=u\) and \(t^{\prime }|_{p^{\prime }}=v\), \((p,p^{\prime })\) is an associative-commutative pair of positions of t and \(t^{\prime }\); or

  2. (Associative-commutative subterms) there are positions \(p\in \mathit {Pos}(t)\), \(p^{\prime }\in \mathit {Pos}(t^{\prime })\) such that the following conditions are satisfied:

    • \((p,p^{\prime })\) is an associative-commutative pair of positions of t and \(t^{\prime }\),

    • \(u=f(u_{1},\ldots ,u_{n_{u}})\), \(n_{u}\geq 1\), \(v=f(v_{1},\ldots ,v_{n_{v}})\), \(n_{v}\geq 1\), f is associative,

    • \(t|_{p}=f(t_{1},\ldots ,t_{n_{p}})\), \(n_{p}\geq 2\), \(t^{\prime }|_{p^{\prime }}=f(t^{\prime }_{1},\ldots ,t^{\prime }_{n_{p^{\prime }}})\), \(n_{p^{\prime }} \geq 2\),

    • for all \(1\leq i\leq n_{u}\), there exists \(1\leq j\leq n_{p}\) s.t. \(u_{i} {=}_{B} t_{j}\), and

    • for all \(1\leq i\leq n_{v}\), there exists \(1\leq j \leq n_{p^{\prime }}\) s.t. \(v_{i} {=}_{B} t^{\prime }_{j}\).

Definition 11 (Associative-commutative Conflict Pair)

Given flattened terms t and \(t^{\prime }\) such that every symbol in t and \(t^{\prime }\) is either free or associative-commutative, the pair (u,v) is called an associative-commutative conflict pair of t and \(t^{\prime }\) if and only if root(u)≠root(v) and (u,v) is an associative-commutative pair of subterms of t and \(t^{\prime }\).

Definition 12 (Identity Pair of Positions)

Given terms t and \(t^{\prime }\) such that every symbol in t and \(t^{\prime }\) is either free or has an identity, and given positions \(p\in \mathit {Pos}(t)\) and \(p^{\prime }\in \mathit {Pos}(t^{\prime })\), the pair \((p,p^{\prime })\) of positions is called an identity pair of positions of t and \(t^{\prime }\) if and only if either

  1. (Base case) p = Λ and \(p^{\prime }={\Lambda }\);

  2. (Free symbol) p = q.i, \(p^{\prime }=q^{\prime }.i\), \(root(t^{\prime }|_{q^{\prime }})=root(t|_{q})\) is a free symbol, and \((q,q^{\prime })\) is an identity pair of positions of t and \(t^{\prime }\);

  3. (Identity on the left) \(\mathit {depth}(p)>\mathit {depth}(p^{\prime })\), p = q.i, \(root(t|_{q})\) has an identity symbol e, and \((q,p^{\prime })\) is an identity pair of positions of t and \(t^{\prime }\); or

  4. (Identity on the right) \(\mathit {depth}(p^{\prime })>\mathit {depth}(p)\), \(p^{\prime }=q^{\prime }.i\), \(root(t^{\prime }|_{q^{\prime }})\) has an identity symbol e, and \((p,q^{\prime })\) is an identity pair of positions of t and \(t^{\prime }\).

Definition 13 (Identity Pair of Subterms)

Given terms t and \(t^{\prime }\) such that every symbol in t and \(t^{\prime }\) is either free or has an identity, the pair (u,v) of terms is called an identity pair of subterms of t and \(t^{\prime }\) if and only if for each pair of positions \(p\in \mathit {Pos}(t)\) and \(p^{\prime }\in \mathit {Pos}(t^{\prime })\) such that \(t|_{p}=u\) and \(t^{\prime }|_{p^{\prime }}=v\), \((p,p^{\prime })\) is an identity pair of positions of t and \(t^{\prime }\).

Definition 14 (Identity Conflict Pair)

Given terms t and \(t^{\prime }\) such that every symbol in t and \(t^{\prime }\) is either free or has an identity, the pair (u,v) is called an identity conflict pair of t and \(t^{\prime }\) if and only if root(u)≠root(v) and (u,v) is an identity pair of subterms of t and \(t^{\prime }\).

We also recall a special notation for subterm replacement when we have associative or associative-commutative conflict pairs and order-sorted information.

Definition 15 (A-Subterm Replacement)

Given two flattened terms t and \(t^{\prime }\) and an associative conflict pair (u,v) with a pair of conflict positions \(p\in \mathit {Pos}(t)\) and \(p^{\prime }\in \mathit {Pos}(t^{\prime })\) such that \(u=f(u_{1},\ldots ,u_{m})\), m ≥ 1, \(v=f(v_{1},\ldots ,v_{n})\), n ≥ 1, f is associative, \(t|_{p}=f(w_{1},\ldots , w_{k_{1}}, u_{1},\ldots , u_{m}, w^{\prime }_{1},\ldots , w^{\prime }_{k_{2}})\), \(k_{1}\geq 0\), \(k_{2}\geq 0\), and \(t^{\prime }|_{p^{\prime }}=f(w^{\prime \prime }_{1},\ldots , w^{\prime \prime }_{k_{3}}, v_{1},\ldots , v_{n}, w^{\prime \prime \prime }_{1},\ldots , w^{\prime \prime \prime }_{k_{4}})\), \(k_{3}\geq 0\), \(k_{4}\geq 0\), we write \(t[[x{:}{\mathsf {s}}]]_{p}\) and \(t^{\prime }[[x{:}{\mathsf {s}}]]_{p^{\prime }}\) to denote the terms \(t[[x{:}{\mathsf {s}}]]_{p}= t[f(w_{1},\ldots , w_{k_{1}}, x{:}{\mathsf {s}}, w^{\prime }_{1},\ldots , w^{\prime }_{k_{2}})]_{p}\) and \(t^{\prime }[[x{:}{\mathsf {s}}]]_{p^{\prime }}= t^{\prime }[f(w^{\prime \prime }_{1},\ldots , w^{\prime \prime }_{k_{3}}, x{:}{\mathsf {s}}, w^{\prime \prime \prime }_{1},\ldots , w^{\prime \prime \prime }_{k_{4}})]_{p^{\prime }}\).

Definition 16 (AC-Subterm Replacement)

Given two flattened terms t and \(t^{\prime }\) and an associative-commutative conflict pair (u,v) with a pair of conflict positions \(p\in \mathit {Pos}(t)\) and \(p^{\prime }\in \mathit {Pos}(t^{\prime })\) such that \(u=f(u_{1},\ldots ,u_{m})\), m ≥ 1, \(v=f(v_{1},\ldots ,v_{n})\), n ≥ 1, f is associative-commutative, \(t|_{p}=f(w_1,\ldots ,w_{k_1})\) s.t. for each \(i\in \{1,\ldots ,m\}\), there is \(j\in \{1,\ldots ,k_{1}\}\) with \(u_{i} {=}_{B} w_{j}\), and \(t^{\prime }|_{p^{\prime }}=f(w^{\prime }_1,\ldots ,w^{\prime }_{k_2})\) s.t. for each \(i\in \{1,\ldots ,n\}\), there is \(j\in \{1,\ldots ,k_{2}\}\) with \(v_{i} {=}_{B} w^{\prime }_j\), we write \(t[[x{:}\mathsf {s}]]_{p}\) and \(t^{\prime }[[x{:}\mathsf {s}]]_{p^{\prime }}\) to denote the terms \(t[[x{:}\mathsf {s}]]_{p}= t[f(\overline {w_1},\ldots ,\overline {w_{k_1}}, x{:}\mathsf {s})]_{p}\) where \(\{\overline {w_1},\ldots ,\overline {w_{k_1}}\}= \bigcup \{w_i \mid 1\leq i\leq k_1, \forall 1\leq j \leq m, w_i \not =_{B} u_j\}\), and \(t^{\prime }[[x{:}\mathsf {s}]]_{p^{\prime }}= t^{\prime }[f(\overline {w^{\prime }_1},\ldots ,\overline {w^{\prime }_{k_2}}, x{:}\mathsf {s})]_{p^{\prime }}\) where \(\{\overline {w^{\prime }_1},\ldots ,\overline {w^{\prime }_{k_2}}\}= \bigcup \{w^{\prime }_i \mid 1\leq i\leq k_2, \forall 1\leq j \leq n, w^{\prime }_i \not =_{B} v_j\}\).

Note that B-pre-regularity is essential here because it ensures that after replacing a subterm by a variable, the least sort does not depend on the chosen representation within the equivalence class of a term, i.e., it does not depend on how the flattened version of the term is obtained.

We recall order-sorted B-lgg computation by subsort specialization using top-sorted generalization and sort-specialized generalization.

Definition 17 (Top-sorted Equational Generalization)

Given a kind-completed, B-pre-regular, order-sorted equational theory (Σ,B) with a set B of ACU axioms, and flattened Σ-terms t and \(t^{\prime }\) such that \([\textit {LS}(t)]=[\textit {LS}(t^{\prime })]\), let \((u_{1},v_{1}),\ldots ,(u_{k},v_{k})\) be the B-conflict pairs of t and \(t^{\prime }\), and for each such conflict pair \((u_{i},v_{i})\), let \(({p^{i}_{1}},\ldots ,p^{i}_{n_{i}}, {q^{i}_{1}},\ldots ,{q}^{i}_{n_{i}})\), \(1\leq i\leq k\), be the corresponding B-conflict positions, and let \(\mathsf {[s_{i}]}=[\textit {LS}(u_{i})]=[\textit {LS}(v_{i})]\). We define the term denoting the top order-sorted equational least general generalization as

$$\textit{tsg}_{E}(t,t^{\prime}) =t[[{x^{1}_{1}}{:}{\mathsf{[s_{1}]}},\ldots,x^{1}_{n_{1}}{:}{\mathsf{[s_{1}]}}]]_{{p^{1}_{1}},\ldots,p^{1}_{n_{1}}} {\cdots} [[{x^{k}_{1}}{:}{\mathsf{[s_{k}]}},\ldots,x^{k}_{n_{k}}{:}{\mathsf{[s_{k}]}}]]_{{p^{k}_{1}},\ldots,p^{k}_{n_{k}}}$$

where \({x^{1}_{1}}{:}\mathsf {[s_{1}]},\ldots ,x^{1}_{n_{1}}{:}\mathsf {[s_{1}]},\ldots ,{x^{k}_{1}}{:}\mathsf {[s_{k}]},\ldots ,x^{k}_{n_{k}}{:}\mathsf {[s_{k}]}\) are fresh variables.

The order-sorted equational lggs are obtained by subsort specialization.

Definition 18 (Sort-specialized Equational Generalization)

Given a kind-completed, B-pre-regular, order-sorted equational theory (Σ,B) with a set B of ACU axioms, and flattened Σ-terms t and \(t^{\prime }\) such that \([{LS}(t)]=[{LS}(t^{\prime })]\), let \((u_{1},v_{1}),\ldots ,(u_{k},v_{k})\) be the conflict pairs of t and \(t^{\prime }\), and for each such conflict pair \((u_{i},v_{i})\), let \({p^{i}_{1}},\ldots ,p^{i}_{n_{i}}\), \(1\leq i\leq k\), be the corresponding B-conflict positions, let \(\mathsf {[s_{i}]}=[\textit {LS}(u_{i})]=[\textit {LS}(v_{i})]\), and let \({x^{1}_{1}}{:}{\mathsf {[s_{1}]}},\ldots ,x^{1}_{n_{1}}{:}{\mathsf {[s_{1}]}},\ldots ,{x^{k}_{1}}{:}{\mathsf {[s_{k}]}},\ldots ,x^{k}_{n_{k}}{:}{\mathsf {[s_{k}]}}\) be the variable identifiers used in Definition 17. We define

$$ \begin{array}{ll} \textit{sort-down-subs}_{E}(t,t^{\prime}) = \{\rho \mid & \mathit{Dom}(\rho)=\{{x^{1}_{1}}{:}{\mathsf{[s_{1}]}},\ldots,x^{1}_{n_{1}}{:}{\mathsf{[s_{1}]}},\ldots,{x^{k}_{1}}{:}{\mathsf{[s_{k}]}},\ldots,x^{k}_{n_{k}}{:}{\mathsf{[s_{k}]}}\} \ \wedge\ \\ & \forall 1\leq i\leq k, \forall 1\leq j\leq n_{i}:\\ & ({x^{i}_{j}}{:}{\mathsf{[s_{i}]}})\rho=x_{i}{:}{\mathsf{s^{\prime}_{i}}} \wedge {\mathsf{s^{\prime}_{i}}}\in\textit{LUBS}(\textit{LS}(u_{i}),\textit{LS}(v_{i})) \} \end{array} $$

where all the \(x_{i}{:}{\mathsf {s^{\prime }_{i}}}\) are fresh variables. The set of sort-specialized B-generalizers is defined as \(\textit {ssg}_{E}(t,t^{\prime }) =\{\textit {tsg}_{E}(t,t^{\prime })\rho \mid \rho \in \textit {sort-down-subs}_{E}(t,t^{\prime })\}\).
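The sort-specialization step can be illustrated with a small Python sketch. It assumes that LUBS(s, s′) denotes the set of minimal common upper bounds of two sorts in a finite subsort poset; the sort names, the EDGES relation, and the function names are hypothetical. Each returned dictionary corresponds to one substitution ρ of Definition 18, mapping the index i of a conflict pair to the sort chosen for its variable.

```python
from itertools import product

# Hypothetical subsort relation given as direct edges (subsort, supersort).
EDGES = {("A", "C"), ("A", "D"), ("B", "C"), ("B", "D"), ("C", "Top"), ("D", "Top")}


def upper(s, edges):
    """All sorts above or equal to s (reflexive-transitive closure)."""
    seen, todo = {s}, [s]
    while todo:
        cur = todo.pop()
        for lo, hi in edges:
            if lo == cur and hi not in seen:
                seen.add(hi)
                todo.append(hi)
    return seen


def lubs(s1, s2, edges):
    """Minimal elements among the common upper bounds of s1 and s2."""
    common = upper(s1, edges) & upper(s2, edges)
    return {s for s in common
            if not any(o != s and s in upper(o, edges) for o in common)}


def sort_down_choices(pair_sorts, edges):
    """One dict per substitution rho: conflict-pair index -> chosen sort."""
    options = [sorted(lubs(a, b, edges)) for a, b in pair_sorts]
    return [dict(enumerate(choice, start=1)) for choice in product(*options)]


# Least sorts (LS(u_i), LS(v_i)) of two hypothetical conflict pairs.
print(sort_down_choices([("A", "B"), ("A", "A")], EDGES))
# [{1: 'C', 2: 'A'}, {1: 'D', 2: 'A'}]
```

Because A and B have two incomparable minimal upper bounds in this poset, the first conflict pair yields two sort-specialized generalizers, one per choice.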

A.3 Proof of Theorems 2 and 3

The auxiliary notions and results in this section are similar to the corresponding ones in [4], although the proofs of some of the results were just sketched there and we have completed them.

Let us prove that the range of the substitutions partially computed at any stage of a generalization derivation coincides with the set of the index variables of the configuration, except for the generalization variable x of the original generalization problem \({t}\overset {x}{\triangleq }{t^{\prime }}\). This is stated in the following lemma that is similar to Lemma 28 of [4].

Lemma 2 (Range of Substitutions)

Given terms t and \(t^{\prime }\) and a fresh variable x such that \(\langle {t \overset {x}{\scriptstyle {\triangleq }} t^{\prime }} \mid \emptyset \mid \textit {id} \rangle \to ^{*} \langle C \mid S \mid \theta \rangle \) using the inference rules of Figs. 1, 2, 3, 5, 6 and 7, then \(\textit {Index}(S \cup C)\subseteq \mathit {VRan}({\theta })\cup \{x\}\), and \(\mathit {VRan}(\theta ) = \mathit {Var}(x\theta )\).

Proof

Immediate by construction. □

The following lemma establishes an auxiliary property that is useful for formulating the notion of an identity conflict pair of terms. It is similar to Lemma 30 of [4].

Lemma 3

Given terms t and \(t^{\prime }\) such that every symbol in t and \(t^{\prime }\) is either free or has an identity, and a fresh variable x, then there is a sequence \(\langle {t \overset {x}{\scriptstyle {\triangleq }} t^{\prime }} \mid \emptyset \mid \textit {id} \rangle \to ^{*} \langle {u \overset {y}{\scriptstyle {\triangleq }} v} \wedge C \mid S \mid \theta \rangle \) using the inference rules of Figs. 1, 2, and 3 such that there is no variable z such that \({u \overset {z}{\scriptstyle {\triangleq }} v}\in S\) if and only if (u,v) is an identity pair of subterms of t and \(t^{\prime }\).

Proof

Straightforward by successive application of the inference rule \(\textbf {Decompose}_{B}\) of Fig. 1 and the inference rule \(\textbf {Expand}_{U}\) of Fig. 2. □

The following lemma expresses the precise connection between the constraints in a derivation and the identity conflict pairs of the initial configuration. It is similar to Lemma 31 of [4].

Lemma 4

Given terms t and \(t^{\prime }\) such that every symbol in t and \(t^{\prime }\) is either free or has an identity, and a fresh variable x, then there is a sequence \(\langle {t \overset {x}{\scriptstyle {\triangleq }} t^{\prime }} \mid \emptyset \mid \textit {id} \rangle {\rightarrow ^{*}} \langle C \mid {u \overset {y}{\scriptstyle {\triangleq }} v} \wedge S \mid \theta \rangle \) using the inference rules of Figs. 1, 2, and 3 if and only if (u,v) is an identity conflict pair of t and \(t^{\prime }\).

Proof

(⇒) Since \(\langle {t \overset {x}{{\triangleq }} t^{\prime }} \mid \emptyset \mid \textit {id} \rangle {\rightarrow ^{*}} \langle C \mid {u \overset {y}{\scriptstyle {\triangleq }} v} \wedge S \mid \theta \rangle \), then there must be two configurations \(\langle {u \overset {y}{\scriptstyle {\triangleq }} v} \wedge C_{1} \mid S_{1} \mid \theta _{1} \rangle \), \(\langle C_{2} \mid {u \overset {y}{\scriptstyle {\triangleq }} v} \wedge S_{2} \mid \theta _{2} \rangle \) such that

$$\langle {t \overset{x}{\scriptstyle{\triangleq}} t^{\prime}} \mid \emptyset \mid \textit{id}\ \rangle \to^{*} \langle {u \overset{y}{\scriptstyle{\triangleq}} v} \wedge C_{1} \mid S_{1} \mid \theta_{1} \rangle,$$
$$\langle {u \overset{y}{\scriptstyle{\triangleq}} v} \wedge C_{1} \mid S_{1} \mid \theta_{1} \rangle \to_{{\text{\textbf{Solve}}}_{B}} \langle C_{2} \mid {u \overset{y}{\scriptstyle{\triangleq}} v} \wedge S_{2} \mid \theta_{2} \rangle,$$
$$\langle C_{2} \mid {u \overset{y}{\scriptstyle{\triangleq}} v} \wedge S_{2} \mid \theta_{2} \rangle \to^{*} \langle \emptyset \mid {u \overset{y}{\scriptstyle{\triangleq}} v} \wedge S \mid \theta \rangle,$$

and, by application of the inference rule \(\textbf {Solve}_{B}\), root(u)≠root(v). By using Lemma 3 with the derivation \(\langle {t \overset {x}{\scriptstyle {\triangleq }} t^{\prime }} \mid \emptyset \mid \textit {id}\ \rangle \to ^{*} \langle {u \overset {y}{\scriptstyle {\triangleq }} v} \wedge C_{1} \mid S_{1} \mid \theta _{1} \rangle \), (u,v) is an identity pair of subterms of t and \(t^{\prime }\). Therefore, (u,v) is an identity conflict pair.

(⇐) By Lemma 3, there is a configuration \(\langle {u \overset {y}{\scriptstyle {\triangleq }} v} \wedge C_{1} \mid S_{1} \mid \theta _{1} \rangle \) such that \(\langle {t \overset {x}{\scriptstyle {\triangleq }} t^{\prime }} \mid \emptyset \mid \textit {id}\ \rangle \to ^{*} \langle {u \overset {y}{\scriptstyle {\triangleq }} v} \wedge C_{1} \mid S_{1} \mid \theta _{1} \rangle \), and root(u)≠root(v). Then, the inference rule \(\textbf {Solve}_{B}\) is applied, i.e., \(\langle {u \overset {y}{\scriptstyle {\triangleq }} v} \wedge C_{1} \mid S_{1} \mid \theta _{1} \rangle \to \langle C_{1} \mid {u \overset {y}{\scriptstyle {\triangleq }} v} \wedge S_{1} \mid \theta _{1} \rangle \), and \({u \overset {y}{\scriptstyle {\triangleq }} v}\) will be part of S in the final configuration \(\langle \emptyset \mid S \mid \theta \rangle \). □

Finally, the following lemma establishes the link between the computed substitution and a proper generalizer. It is similar to the proof of Lemma 32 of [4]. We have underlined the beginning of the extra necessary cases that allow us to repair the original proof of [4].

Lemma 1

Given terms t and \(t^{\prime }\) such that every symbol in t and \(t^{\prime }\) is either free or has an identity, and a fresh variable x,

  • if \(\langle {t \overset {x}{\scriptstyle {\triangleq }} t^{\prime }} \mid \emptyset \mid \textit {id} \rangle {\rightarrow ^{*}} \langle C \mid S \mid \theta \rangle \) using the inference rules of Figs. 1, 2, and 3, then \(x\theta \) is a generalizer of t and \(t^{\prime }\) modulo identity;

  • if u is a generalizer of t and \(t^{\prime }\) modulo identity, then there is a derivation \(\langle {t \overset {x}{\scriptstyle {\triangleq }} t^{\prime }} \mid \emptyset \mid \textit {id} \rangle {\rightarrow ^{*}} \langle C \mid S \mid \theta \rangle \) using the inference rules of Figs. 1, 2, and 3, such that \(u \equiv _{B} x\theta \).

Proof

By structural induction on the term \(x\theta \) (resp. u). If \(x\theta = x\) (resp. u is a variable), then \(\theta = \textit {id}\) and the conclusion follows. If \(x\theta = f(u_{1},\ldots ,u_{k})\) (resp. \(u = f(u_{1},\ldots ,u_{k})\)) and f is free, then the inference rule \(\textbf {Decompose}_{B}\) of Fig. 1 is applied and we have that \(t = f(t_{1},\ldots ,t_{k})\) and \(t^{\prime }=f(t^{\prime }_{1},\ldots ,t^{\prime }_{k})\). If f has an identity symbol e and \(x\theta = f(u_{1},u_{2})\) (resp. \(u = f(u_{1},u_{2})\)), then we have two possibilities: (1) the inference rule \(\textbf {Expand}_{U}\) of Fig. 2 is applied and we have that either: (i) \(t = f(t_{1},t_{2})\) and \(t^{\prime }=f(t^{\prime }_{1},t^{\prime }_{2})\); (ii) \(t = f(t_{1},t_{2})\) and \(root(t^{\prime })\neq f\); or (iii) \(root(t)\neq f\) and \(t^{\prime }=f(t^{\prime }_{1},t^{\prime }_{2})\). (2) the inference rule \(\textbf {Recover}_{U}\) of Fig. 3 is applied and we have that \(root(t)\neq f\) and \(root(t^{\prime })\neq f\).

For the case when f is free, by using the induction hypothesis, \(u_{i}\) is a generalizer of \(t_{i}\) and \(t^{\prime }_{i}\), for each i.

For the case when f has an identity symbol e and the inference rule \(\textbf {Expand}_{U}\) was applied, by using the induction hypothesis, \(u_{1}\) is a generalizer of either \(t_{1}\) and \(t^{\prime }_{1}\), \(t_{1}\) and \(t^{\prime }\) (by applying \(f(x,e) \doteq x\) to \(t^{\prime }\)), or t and \(t^{\prime }_{1}\) (by applying \(f(x,e) \doteq x\) to t). Similarly, \(u_{2}\) is a generalizer of either \(t_{2}\) and \(t^{\prime }_{2}\), \(t_{2}\) and \(t^{\prime }\) (by applying \(f(e,x) \doteq x\) to \(t^{\prime }\)), or t and \(t^{\prime }_{2}\) (by applying \(f(e,x) \doteq x\) to t).

For the case when f has an identity symbol e and the inference rule \(\textbf {Recover}_{U}\) was applied, by using the induction hypothesis, either \(u_{1}\) is a generalizer of t and e and \(u_{2}\) is a generalizer of \(t^{\prime }\) and e, or \(u_{1}\) is a generalizer of \(t^{\prime }\) and e and \(u_{2}\) is a generalizer of t and e. Now, if for each pair of terms in \(u_{1},\ldots ,u_{k}\) there are no shared variables, then the conclusion follows. Otherwise, for each variable z shared between two different terms \(u_{i}\) and \(u_{j}\), there is a constraint \({w_{1}}\overset {z}{\triangleq }{w_{2}} \in S\), and, by Lemma 4, there is an identity conflict pair \((w_{1},w_{2})\) in \(t_{i}\) and \(t^{\prime }_{i}\). Thus, the conclusion follows. □
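As a small illustration of case (1)(ii) in the proof above, consider a hypothetical binary symbol f with identity e and constants a, b, and c (these names are assumptions introduced here and are not taken from the paper). Take \(t=f(a,b)\) and \(t^{\prime }=c\). Applying \(f(x,e) \doteq x\) to \(t^{\prime }\) exposes the symbol f at its root:

$$c =_{U} f(c,e), \qquad f(x,y)\{x\mapsto a,\ y\mapsto b\} = f(a,b), \qquad f(x,y)\{x\mapsto c,\ y\mapsto e\} = f(c,e) =_{U} c.$$

Hence \(f(x,y)\) is a generalizer of \(f(a,b)\) and c modulo identity, where x generalizes a and c, and y generalizes b and e.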

Correctness and completeness are finally proved as follows.

Theorem 2 (Correctness)

Given a kind-completed, B-pre-regular, order-sorted equational theory (Σ,B) with a set B of ACU axioms, and a generalization problem \({\varGamma }={t}\overset {{x{:}\mathsf {[s]}}}{\triangleq }{t^{\prime }}\), with \(\mathsf {[s]}=[\textit {LS}(t)]=[\textit {LS}(t^{\prime })]\), such that t and \(t^{\prime }\) are flattened Σ-terms and the subsignature \({\Sigma }_{\varGamma }\) is U-tolerant, if \(\langle {t \overset {x{:}{\mathsf {[s]}}}{\scriptstyle {\triangleq }} t^{\prime }} \mid \emptyset \mid \textit {id} \rangle {\rightarrow ^{*}} \langle \emptyset \mid S \mid \theta \rangle \) using the inference rules of Figs. 1, 2, 3, 5, 6 and 7, then \((x{:}\mathsf {[s]})\theta \) is a generalizer of t and \(t^{\prime }\).

Proof

By Lemma 1. □

Theorem 3 (Completeness)

Given a kind-completed, B-pre-regular, order-sorted equational theory (Σ,B) with a set B of ACU axioms, and a generalization problem \({\varGamma }={t}\overset {x{:}\mathsf {[s]}}{\triangleq }{t^{\prime }}\), with \(\mathsf {[s]}=[\textit {LS}(t)]=[\textit {LS}(t^{\prime })]\), such that t and \(t^{\prime }\) are flattened Σ-terms and the subsignature \({\Sigma }_{\varGamma }\) is U-tolerant, if u is a least general generalizer of t and \(t^{\prime }\) modulo B, then there is a derivation \(\langle {t \overset {x{:}{\mathsf {[s]}}}{\scriptstyle {\triangleq }} t^{\prime }} \mid \emptyset \mid \textit {id} \rangle {\rightarrow ^{*}} \langle C \mid S \mid \theta \rangle \) using the inference rules of Figs. 1, 2, 3, 5, 6 and 7 such that \(u \equiv _{B} (x{:}\mathsf {[s]})\theta \).

Proof

By contradiction. Consider a derivation \(\langle {t \overset {x}{\scriptstyle {\triangleq }} t^{\prime }} \mid \emptyset \mid \textit {id}\rangle \to ^{*} \langle \emptyset \mid S \mid \theta \rangle \) such that \(x\theta \) is not a least general generalizer of t and \(t^{\prime }\) up to renaming. Since \(x\theta \) is a generalizer of t and \(t^{\prime }\) by Lemma 1, there is a substitution ρ which is not a variable renaming such that \(x\theta \rho =_{B} u\). By Lemma 2, \(\mathit {VRan}(\theta ) = \mathit {Var}(x\theta )\); hence, we can choose ρ with \(\mathit {Dom}(\rho ) = \mathit {Var}(x\theta )\). Now, since ρ is not a variable renaming, either:

  1. there are variables \(y,y^{\prime }\in \mathit {Var}(x\theta )\) and a variable z such that \(y\rho =y^{\prime }\rho =z\), or

  2. there is a variable \(y\in \mathit {Var}(x\theta )\) and a non-variable term v such that \(y\rho = v\).

In case (1), there are two conflict positions \(p,p^{\prime }\) for t and \(t^{\prime }\) such that \(u|_{p}=z=u|_{p^{\prime }}\) and \(x\theta |_{p} = y\) and \(x\theta |_{p^{\prime }}=y^{\prime }\). Specifically, this means that \(t|_{p}=t|_{p^{\prime }}\) and \(t^{\prime }|_{p}=t^{\prime }|_{p^{\prime }}\). However, this is impossible by Lemmas 4 and 2. In case (2), there is a position p such that \(x\theta |_{p} = y\) and p is neither a conflict position of t and \(t^{\prime }\) nor is it under a conflict position of t and \(t^{\prime }\). Since this is impossible by Lemmas 4 and 2, the claim is proved. □

About this article

Cite this article

Alpuente, M., Escobar, S., Meseguer, J. et al. Order-sorted equational generalization algorithm revisited. Ann Math Artif Intell 90, 499–522 (2022). https://doi.org/10.1007/s10472-021-09771-1

  • DOI: https://doi.org/10.1007/s10472-021-09771-1

Keywords

  • Least general generalization
  • Rule-based languages
  • Equational reasoning
  • Order-Sorted
  • Associativity
  • Commutativity
  • Identity

Mathematics Subject Classification (2010)

  • 68N17
  • 68N18
  • 68Q42
  • 68Q60
  • 68T20
  • 68W30