Logic program proportions

The purpose of this paper is to present a fresh idea on how symbolic learning might be realized via analogical reasoning. For this, we introduce directed analogical proportions between logic programs of the form "$P$ transforms into $Q$ as $R$ transforms into $S$" as a mechanism for deriving similar programs by analogy-making. The idea is to instantiate a fragment of a recently introduced abstract algebraic framework of analogical proportions in the domain of logic programming. Technically, we define proportions in terms of modularity, deriving abstract forms of concrete programs from a "known" source domain which can then be instantiated in an "unknown" target domain to obtain analogous programs. To this end, we introduce algebraic operations for syntactic logic program composition and concatenation. Interestingly, our work suggests a close relationship between modularity, generalization, and analogy which we believe should be explored further in the future. In a broader sense, this paper is a further step towards a mathematical theory of logic-based analogical reasoning and learning, with potential applications to open AI problems like commonsense reasoning and computational learning and creativity.


Introduction
This paper is a first step towards an answer to the following question: How can a computer "creatively" generate interesting logic programs from a collection of given ones by using analogical reasoning? For example, given a program for the addition of natural numbers, how can we systematically generate from it a program for "adding" lists using analogy? This question can be stated mathematically in the form of a proportional equation between programs as Nat : Plus :: List : X, where Nat and List are programs for generating the natural numbers and lists, respectively, Plus is the arithmetical program for the addition of numbers, and X is a placeholder for a concrete program, the solution to the equation. That is, we are asking for a program X = PlusList which operates on lists and is analogous to the program Plus. A solution to this equation indeed yields a reasonable program for the "addition" of lists, namely the program for appending lists (see Examples 9 and 22).
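The analogy behind this opening question can be illustrated with two structurally identical recursions (a hypothetical Python sketch, not the paper's formal machinery): addition of numerals and concatenation of lists follow the same recursion scheme over their respective constructors, with s(x) played by x + 1 and cons(u, x) by [u] + x.

```python
# Hypothetical illustration: addition on numerals and append on lists
# share one recursion scheme (base case on the empty constructor,
# recursive case peeling off one constructor).

def plus(x, y):
    # numerals encoded as nonnegative ints: 0 ~ 0, s(x) ~ x + 1
    return y if x == 0 else 1 + plus(x - 1, y)

def append(x, y):
    # lists: [] ~ [], cons(u, x) ~ [u] + x
    return y if x == [] else [x[0]] + append(x[1:], y)

print(plus(2, 3))                  # 5
print(append(["a", "b"], ["c"]))   # ['a', 'b', 'c']
```

Solving the proportional equation above amounts to discovering that append stands to lists as plus stands to numerals.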
To this end, in this paper we introduce directed analogical proportions between logic programs of the form "P transforms into Q as R transforms into S" - in symbols, P → Q : • R → S - as an instance of Antić's (2022) general framework of analogical proportions as a mechanism for constructing similar programs by analogy-making. The purpose of this paper is to present a fresh idea on how logic-based symbolic learning can be realized via analogical reasoning, not to give full-fledged solutions to real-world problems - it is the first paper in a promising direction, and hopefully not the last one.
In the literature, computational learning usually means learning from (a massive amount of) data. For example, in "deep learning" artificial neural networks (ANNs) extract abstract features from data sets (cf. Goodfellow, Bengio, & Courville, 2016; LeCun, Bengio, & Hinton, 2015) and, on the symbolic side, inductive logic programs (ILPs) are provided with positive and negative examples of the target concept to be learned (cf. Muggleton, 1991). Another characteristic feature of current machine learning systems is the focus on goal-oriented problem solving - a typical task of ANNs is the categorization of the input data (e.g., finding cats in images), and ILPs try to construct logic programs from given examples which partially encode the problem to be solved (e.g., adding numbers or sorting lists).
The emphasis in this paper is different, as we believe that program generation is equally important to artificial intelligence - and may even be more important for artificial general intelligence than problem-solving - and deserves much more attention. This is Sir Michael Atiyah's (Fields Medalist) answer to the question of how he selects a problem to study: "I think that presupposes an answer. I don't think that's the way I work at all. Some people may sit back and say, 'I want to solve this problem' and they sit down and say, 'How do I solve this problem?' I don't. I just move around in the mathematical waters, thinking about things, being curious, interested, talking to people, stirring up ideas; things emerge and I follow them up. Or I see something which connects up with something else I know about, and I try to put them together and things develop. I have practically never started off with any idea of what I'm going to be doing or where it's going to go. I'm interested in mathematics; I talk, I learn, I discuss and then interesting questions simply emerge. I have never started off with a particular goal, except the goal of understanding mathematics." (cf. Gowers, 2000).
The process Sir Michael Atiyah is describing is the generation of new knowledge by connecting existing knowledge in a novel way without any specific "goal" in mind, and it is believed by many researchers that analogy-making is the core mechanism for doing so (e.g., Hofstadter & Sander, 2013).
In the framework presented in this paper, "program generation" means the construction of novel logic programs in an "unknown" target domain via analogical transfer - realized by directed logic program proportions via generalization and instantiation - from a "known" source domain. This approach is similar to ILP in that novel programs are derived from experience represented as knowledge bases consisting of "known" programs. However, it differs significantly from ILP in how novel programs are constructed from experience - while in ILP the construction is goal-oriented and thus guided by partial specifications in the form of given examples, in our setting programs are derived by analogy-making to similar programs (without the need for concrete examples). For instance, we may ask - by analogy to arithmetic - what it means to "multiply" two arbitrary lists (cf. Example 22) or to reverse "even" lists (cf. Examples 10 and 23). Here, contrary to ILP, we do not expect a supervisor to provide the system with examples explaining list "multiplication" or "evenness" of lists; instead, we assume that there are arithmetic programs operating on numbers (i.e., numerals) - programs defining multiplication and evenness of numbers - which we can transfer to the list domain.
Example 1. Imagine two domains, one consisting of numbers (or numerals) and the other made up of lists. We know from basic arithmetic what it means to add two numerals. Now suppose we want to transfer the concept of addition to the list domain. We can then ask - by analogy - the following question: What does it mean to "add" two lists? We can transform this question into the following directed analogical equation:

Nat → Plus : • List → X. (1)

In our framework, Nat, Plus, and List will be logic programs, and X will be a program variable standing for a program which is obtained from List as Plus is obtained from Nat. That is, solutions to (1) will be programs implementing 'addition of lists'. The idea is to derive an abstract form Plus(Z) as a generalization of the concrete program Plus such that Plus = Plus(Nat).
That is, we factor Plus into subprograms and generalize every instance of the subprogram Nat in Plus by a program variable Z. We can then instantiate the form Plus(Z) with List to obtain a plausible solution to (1), that is, the program Plus(List) for 'addition of lists'. It is important to emphasize that in order to be able to decompose the program Plus with respect to the above algebraic operations so that Nat occurs as a factor, we need to introduce novel algebraic operations on logic programs (Section 3). We will return to this example, in a more formal manner, in Examples 9 and 22.
In (semi-)automatic programming (Czarnecki & Eisenecker, 2000), one usually wants to construct a program given some specification. This is complicated by the fact that writing the complete specification is often as complex as writing the logic program itself (cf. Kowalski, 1984). In this paper, we therefore propose a different view on automatic programming motivated by Sir Atiyah's approach to mathematical research quoted above: Instead of trying to satisfy a (complete) specification, a teacher iteratively constructs a source program S in a "known" domain by constructing a sequence of programs S1 → S2 → ... → Sn → S whose "limit" is S; the student then tries to "copycat" the same process in an "unknown" target domain by constructing a sequence P1 → P2 → ... → Pn → P such that each step of the target sequence is in directed proportion to the corresponding step of the source sequence. By construction, the program P is then "analogous" to the source program S. Interestingly, our work suggests a close relationship between modularity, generalization, and analogy which we believe should be explored further in the future.
In a broader sense, this paper is a first step towards a mathematical theory of logic-based analogical reasoning and learning in knowledge representation and reasoning systems with potential applications to fundamental AI-problems like commonsense reasoning and computational learning and creativity.

Logic programs
In this section, we recall the syntax and semantics of logic programming by mainly following the lines of Apt (1990).
2.1. Syntax. An (unranked first-order) language L consists of a set PsL of predicate symbols, a set FsL of function symbols, a set CsL of constant symbols, and a denumerable set V = {z1, z2, ...} of variables. Terms and atoms are defined in the usual way. Substitutions and (most general) unifiers of terms and (sets of) atoms are defined as usual.
Let L be a language. A (Horn logic) program over L (or L-program) is a set of rules of the form

A0 ← A1, . . ., Ak, (2)

where A0, . . ., Ak are atoms over L. It will be convenient to define, for a rule r of the form (2), head(r) := {A0} and body(r) := {A1, . . ., Ak}, extended to programs by head(P) := ∪{head(r) : r ∈ P} and body(P) := ∪{body(r) : r ∈ P}. In this case, the size of r, denoted by sz(r), is k. A fact is a rule with empty body, and a proper rule is a rule which is not a fact. We denote the sets of facts and proper rules in P by facts(P) and proper(P), respectively.
A program P is ground if it contains no variables and we denote the grounding of P which contains all ground instances of the rules in P by gnd(P).
We call any bijective substitution a renaming. The set of all variants of P is defined by variants(P) := {Pρ : ρ is a renaming}.

The main predicate of a program is given by the name of the program in lower case letters if not specified otherwise. We will sometimes write Pp to make the main predicate p in P explicit, and we will occasionally write P(x) to indicate that P contains variables among x = x1, . . ., xn, n ≥ 1. We denote the program constructed from Pp by replacing every occurrence of the predicate symbol p with q by P[p/q].

Example 2. Later, we will be interested in the basic data structures of numerals, lists, and (binary) trees. The programs for generating numerals and lists are given by

Nat(x) = { nat(0), nat(s(x)) ← nat(x) } and List(u, x) = { list([ ]), list([u | x]) ← list(x) }.

As is customary in logic programming, [ ] and [u | x] are syntactic sugar for nil and cons(u, x), respectively. The program for generating (binary) trees is given by

Tree(u, x, y) = { tree(void), tree(t(u, x, y)) ← tree(x), tree(y) }.

For instance, the tree consisting of a root a and two leafs b and c is symbolically represented as tree(t(a, t(b, void, void), t(c, void, void))). The skeleton of Tree is given by

sk(Tree) = { tree, tree ← tree, tree }. (3)

We will frequently refer to the programs above in the rest of the paper.

2.2. Semantics. An interpretation is any set of ground atoms. We define the entailment relation |=, for every interpretation I, inductively as follows:
• For a ground atom A, I |= A iff A ∈ I.
• For a set of ground atoms B, I |= B iff B ⊆ I.
• For a ground rule r of the form (2), I |= r iff I |= body(r) implies I |= head(r).
• Finally, for a ground program P, I |= P iff I |= r holds for each rule r ∈ P.
In case I |= gnd(P), we call I a model of P. The set of all models of P has a least element with respect to set inclusion, called the least model of P and denoted by LM(P).
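The least model of a ground program can be computed as the fixpoint of the usual immediate-consequence operator. A minimal sketch, where the (head, body-atom set) rule encoding is our own assumption for illustration:

```python
# Least model of a ground program as a fixpoint: add the head of every
# rule whose body atoms are already derived, until nothing changes.
# Assumed encoding: a rule is a (head_atom, frozenset_of_body_atoms) pair.

def least_model(program):
    model = set()
    while True:
        new = {h for (h, body) in program if body <= model} - model
        if not new:
            return model
        model |= new

# gnd(Nat), cut off at the numeral s(s(0)):
gnd_nat = [("nat(0)", frozenset()),
           ("nat(s(0))", frozenset({"nat(0)"})),
           ("nat(s(s(0)))", frozenset({"nat(s(0))"}))]
print(sorted(least_model(gnd_nat)))  # ['nat(0)', 'nat(s(0))', 'nat(s(s(0)))']
```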
We call a ground atom A a (logical) consequence of P - in symbols, P |= A - if A is contained in the least model of P, and we say that P and R are (logically) equivalent if LM(P) = LM(R).

Algebra of logic programs
Our framework of analogical proportions between logic programs will be built on top of an algebra of logic programs which allows us to decompose programs into simpler modules via algebraic operations on programs. For this, it will be useful to introduce in this section two novel algebraic operations for logic program composition (Section 3.1) and concatenation (Section 3.2).
In the rest of the paper, P and R denote logic programs over some joint unranked first-order language L.
3.1. Composition. The rule-like structure of logic programs induces naturally a compositional structure which allows us to decompose programs rule-wise.
We define the (sequential) composition of P and R, in the ground case, by

P • R := { head(r) ← body(S) : r ∈ P, S ⊆ R, head(S) = body(r) },

lifted to non-ground programs via most general unifiers of rule bodies and (variants of) rule heads.
Roughly, we obtain the composition of P and R by resolving all body atoms in P with the 'matching' rule heads of R. This is illustrated in the next example, where we construct the even numbers from the natural numbers via composition.
Example 3. Reconsider the program Nat of Example 2 generating the natural numbers. By composing the only proper rule in Nat with itself, we obtain

nat(s(s(x))) ← nat(x).

Notice that this program, together with the single fact in Nat, generates the even numbers. Let us therefore define the program

Even := facts(Nat) ∪ proper(Nat) • proper(Nat). (4)

We will come back to this program in Example 10.
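The composition step of Example 3 can be sketched for rules with a single body atom: resolve the body atom of the first rule against the (renamed) head of the second. The term encoding, the capitalized-variable convention, and the priming-based renaming below are our own assumptions for illustration.

```python
# Terms are nested tuples; variables are strings starting with an
# uppercase letter, e.g. ("s", "X") encodes s(X).

def is_var(t):
    return isinstance(t, str) and t[:1].isupper()

def walk(t, sub):
    while is_var(t) and t in sub:
        t = sub[t]
    return t

def unify(s, t, sub):
    s, t = walk(s, sub), walk(t, sub)
    if s == t:
        return sub
    if is_var(s):
        return {**sub, s: t}
    if is_var(t):
        return {**sub, t: s}
    if isinstance(s, tuple) and isinstance(t, tuple) and len(s) == len(t):
        for a, b in zip(s, t):
            sub = unify(a, b, sub)
            if sub is None:
                return None
        return sub
    return None

def subst(t, sub):
    t = walk(t, sub)
    return tuple(subst(a, sub) for a in t) if isinstance(t, tuple) else t

def rename(t):
    # take a variant by priming every variable
    if is_var(t):
        return t + "'"
    return tuple(rename(a) for a in t) if isinstance(t, tuple) else t

def compose1(r1, r2):
    """Resolve the single body atom of r1 against the head of r2."""
    head2, body2 = rename(r2[0]), rename(r2[1])
    sub = unify(r1[1], head2, {})
    return None if sub is None else (subst(r1[0], sub), subst(body2, sub))

# nat(s(X)) <- nat(X), composed with a variant of itself, yields
# nat(s(s(X'))) <- nat(X') as in Example 3:
step = (("nat", ("s", "X")), ("nat", "X"))
print(compose1(step, step))  # (('nat', ('s', ('s', "X'"))), ('nat', "X'"))
```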
The following example shows that, unfortunately, composition is not associative.
Example 4. Let us compute ({r}P)R. We first compute {r}P by noting that the rule r has two body atoms and therefore size 2, which means that there is only a single choice of subprogram S of P with two rules, namely S = P. Next we compute ({r}P)R = {r}R by noting that now there are two possible subprograms S1, S2 ⊆ R with head(S1) = head(S2) = body(r). Let us now compute {r}(PR): we first compute PR and then note that there are four two-rule subprograms S1, S2, S3, S4 ⊆ PR with b or c in their heads. In total, the two computations yield different programs, and we have thus shown ({r}P)R ≠ {r}(PR).

Remark 5. Notice that the least model of P • R is, in general, not obtained from the least models of its factors P and R in an obvious way. For example, the least models of P := {a ← b} and R := {b} are ∅ and {b}, respectively, whereas P • R = {a} has least model {a}.

3.2. Concatenation. In many cases, a program is the 'concatenation' of two or more simpler programs on an atomic level. A typical example is a program of the form

p([ ], 0)
p([u | x], s(y)) ← p(x, y),

which is, roughly, the 'concatenation' of List in the first and Nat in the second argument modulo renaming of predicate symbols (cf. Example 2). This motivates the following definition.
We define the concatenation of P and R inductively as follows: (1) For atoms p(s) and p(t) over the same predicate symbol, their concatenation is the atom p(s t) obtained by concatenating the argument tuples s and t; this is extended to sets of atoms B and B′ by concatenating atoms element-wise. (2) For rules r and r′ with sk(r) = sk(r′), the concatenation of r and r′ is the rule whose head is the concatenation of head(r) and head(r′) and whose body is the concatenation of body(r) and body(r′). (3) Finally, the concatenation of P and R consists of all concatenations of rules r ∈ P and r′ ∈ R with sk(r) = sk(r′). We will often write PR instead of P • R in case the operation is understood from the context. We can now formally deconcatenate the list program from above as a concatenation of (renamings of) List and Nat. We will return to deconcatenations of this form in Section 4 (cf. Example 9).

Theorem 6. Concatenation is associative.
Proof. We know that the concatenation of words (here, of argument tuples) is associative. From this we deduce the associativity of atom concatenation, from which we deduce the associativity of rule concatenation, which finally implies the associativity of program concatenation.

Remark 7. The least model of the concatenation of P and R is not obtained from the least models of P and R in an obvious way.

Definition 8. The algebra of logic programs over L consists of all L-programs together with all unary and binary operations on programs introduced above, including composition, concatenation, union, and the least model operator.

Logic program forms
Recall from Example 1 that we wish to derive abstract generalizations of concrete programs, which can then be instantiated to obtain similar programs.We formalize this idea via logic program forms as follows.
In the rest of the paper, we assume that we are given program variables X, Y, Z, . . . as placeholders for concrete programs and an algebra of logic programs P over some fixed language L (cf. Definition 8).
A (logic program) form over P (or P-form) is any well-formed expression built up from L-programs, program variables, and all algebraic operations on programs from above, including substitution. More precisely, P-forms are defined by the grammar

F ::= P | Z | F ∪ F | F F | Fσ,

where P ∈ P is an L-program, Z is a program variable, σ is a substitution, and juxtaposition F F stands for the binary operations of composition and concatenation. We will denote forms by boldface letters.
Forms generalize logic programs and induce transformations on programs in the obvious way by replacing program variables with concrete programs.This means that we can interpret logic program forms as 'meta-terms' over the algebra of logic programs with programs as 'constants', program variables as variables, and algebraic operations on programs as 'function symbols'.This is illustrated in the following examples.
Example 9. The program Plus of Example 1 for the addition of numerals is given by

Plus = { plus(0, y, y), plus(s(x), y, s(z)) ← plus(x, y, z) }.

Recall from Example 1 that we wish to derive a form Plus from Plus which abstractly represents addition. Notice that Plus is, essentially, the concatenation of the program Nat (Example 2) in the first and last argument together with a middle part. We therefore define the form Plus(Zq(x)), where Z is a program variable, q stands for the main predicate symbol in Z, and x is a sequence of variables, by generalizing every occurrence of Nat in this deconcatenation by Z; here z is a sequence of fresh variables distinct from the variables in x. We can think of Plus as a generalization of Plus where we have abstracted from the concrete data type Nat. In fact, Plus is an instance of Plus: Plus = Plus(Nat(x)). Similarly, instantiating the form Plus with the program List(u, x) for constructing the data type of lists (cf. Example 2) yields the program

Plus(List(u, x)) = { plus([ ], y, y), plus([u | x], y, [u | z]) ← plus(x, y, z) },

which is the program for appending lists. For example, we have Plus(List(u, x)) |= plus([a, b], [c, d], [a, b, c, d]).
As a further example, we want to define the 'addition' of (binary) trees by instantiating the form Plus with Tree. Note that we now have multiple choices: since Tree(u, x, y) contains two variables x and y occurring in the second rule's body and head, we have two possibilities: we can either choose x = y or x ≠ y. Let us first consider the program

plus(void, y, y)
plus(t(u, x, x), y, t(u, z, z)) ← plus(x, y, z).
This program 'appends' the tree in the second argument to each leaf of the symmetric tree in the first argument. Notice that all of the above programs are syntactically almost identical; e.g., we can transform Plus into Plus(List) via a simple rewriting of terms. The next example shows that we can derive programs from Plus which syntactically differ more substantially from the above programs. Concretely, one can derive from Plus a program which is logically equivalent to program (8) but contains redundant body atoms. However, in some situations this more complicated representation is beneficial. For example, we can now remove the second and third body atom to obtain the more compact program

plus(void, y, y)
plus(t(u, x1, x2), y, t(u, z1, z2)) ← plus(x1, y, z1), plus(x2, y, z2).
This program, in analogy to program (8), 'appends' the tree in the second argument to each leaf of the not necessarily symmetric tree in the first argument and thus generalizes (8). Generally speaking, solutions to proportional equations may be inexact in nature and may need further transformation in order to satisfy additional information and constraints.
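The instantiation of the form Plus in Example 9 can be mimicked by a program template parameterized by a data-type constructor (a base term plus a "step" pattern). The names and string encoding below are hypothetical; instantiating with Nat's constructor yields addition, and instantiating with List's constructor yields append.

```python
# The form Plus as a template over a constructor (base, step); a sketch,
# not the paper's algebraic definition.

def plus_form(base, step):
    return [f"plus({base}, y, y).",
            f"plus({step('x')}, y, {step('z')}) :- plus(x, y, z)."]

nat_plus = plus_form("0", lambda t: f"s({t})")      # Plus(Nat)
list_plus = plus_form("[]", lambda t: f"[u|{t}]")   # Plus(List)

print(nat_plus[1])   # plus(s(x), y, s(z)) :- plus(x, y, z).
print(list_plus[1])  # plus([u|x], y, [u|z]) :- plus(x, y, z).
```

Swapping one constructor for another is exactly the generalization-then-instantiation step of the running example.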
Example 10. In Example 3, we have constructed the program Even, representing the even numbers, from Nat by inheriting its fact and by iterating its proper rule once. By replacing Nat in (4) by a program variable Z, we arrive at the form

Even(Z) := facts(Z) ∪ proper(Z) • proper(Z).

We can now instantiate this form with arbitrary programs to transfer the concept of "evenness" to other domains. For example, consider a program Reverse for reversing lists.
By instantiating the form Even with Reverse, we obtain the program Even(Reverse).
One can verify that this program reverses lists of even length. Similarly, if Sort is a program for sorting lists, then Even(Sort) is a program for sorting "even" lists, and so on.
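At the level of the functions computed, Even(Z) keeps Z's base case and replaces Z's step by the step composed with itself. A hypothetical functional sketch (not the paper's program-level definition):

```python
# Even(Z), functionally: keep the base, double the step.

def even_form(base, step):
    return base, (lambda t: step(step(t)))  # proper rule composed with itself

base, double_step = even_form(0, lambda n: n + 1)   # instantiate with Nat
evens = [base, double_step(base), double_step(double_step(base))]
print(evens)  # [0, 2, 4]
```

Iterating the doubled step from the base enumerates exactly the even numerals, mirroring how Even(Nat) generates the even numbers.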
Example 11. The program for checking list membership is given by

Member = { member(u, [u | x]), member(u, [v | x]) ← member(u, x) }.

Notice the syntactic similarity between the program List of Example 2 and the second arguments in Member - in fact, we can deconcatenate Member so that a renaming of List occurs as a factor in its second argument. This yields the form Member(Z(u, x)q), where x is a (possibly empty) sequence of variables. We can now ask - by analogy - what "membership" means in the numerical domain. For this, we compute Member(Nat(u)), and one can easily check that this program computes the "less than" relation between numerals.
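Member on lists and "less than" on numerals instantiate one recursion scheme: peel off the outermost constructor, then either stop or recurse. A hypothetical sketch with numerals encoded as nonnegative ints:

```python
# One scheme, two data types: list membership and numeral "less than".

def member(u, xs):
    # member(u, [u|x]).  member(u, [v|x]) :- member(u, x).
    return bool(xs) and (u == xs[0] or member(u, xs[1:]))

def less_than(u, v):
    # "membership" transferred to numerals: u < v, with s(v) ~ v + 1
    return v > 0 and (u == v - 1 or less_than(u, v - 1))

print(member("b", ["a", "b", "c"]))  # True
print(less_than(2, 5))               # True
print(less_than(5, 2))               # False
```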

Logic program proportions
This is the main section of the paper. Recall from Example 1 that we want to formalize analogical reasoning and learning in the logic programming setting via directed analogical proportions between programs. For this, we instantiate here a fragment of Antić's (2022) abstract algebraic framework of analogical proportions within the algebra of logic programs from above, using logic program forms.
Let us first recall Antić's (2022) framework, where we restrict ourselves to the directed fragment. In the rest of the paper, we may assume some "known" source domain P and some "unknown" target domain R, both algebras of logic programs over some languages LP and LR, respectively. We may think of the source domain P as our background knowledge - a repertoire of programs we are familiar with - whereas R stands for an unfamiliar domain which we want to explore via analogical transfer from P. For this we will consider directed analogical equations of the form 'P transforms into Q as R transforms into X' - in symbols, P → Q : • R → X - where P and Q are programs of P, R is a program of R, and X is a program variable. The task of learning logic programs by analogy is then to solve such equations and thus to expand our knowledge about the intimate relationships between (seemingly unrelated) programs; that is, solutions to directed analogical equations will be programs of R which are obtained from R in R as Q is obtained from P in P in a mathematically precise way (Definition 12). Specifically, we want to functionally relate programs via rewrite rules as follows.
Recall from Example 1 that transforming Nat into Plus means transforming Id(Nat) into Plus(Nat), where Id(Z) := Z and Plus(Z) are forms. We can state this transformation more pictorially as the rewrite rule Id → Plus. Now transforming the program List 'in the same way' means transforming Id(List) into Plus(List), which again is an instance of Id → Plus. Let us make this notation official. We will always write F(Z) → G(Z) or F → G instead of (F, G), for any pair of forms F and G containing program variables among Z such that every program variable in G occurs in F. We call such expressions justifications. We denote the set of all justifications with variables among Z by J(Z). We make the convention that → binds weaker than every other algebraic operation.
The above explanation motivates the following definition. Define the set of justifications of a transformation between two programs P and Q in P by

JusP(P → Q) := { F → G ∈ J(Z) : P = F(T) and Q = G(T), for some sequence of programs T in P }.

For instance, Jus(Nat → Plus(Nat)) and Jus(List → Plus(List)) both contain the justification Z → Plus(Z).
We are now ready to state the main definition of the paper as an instance of (the directed fragment of) Antić's (2022, Definition 5).
Definition 12. A directed program equation in (P, R) is an expression of the form 'P transforms into Q as R transforms into X' - in symbols,

P → Q : • R → X, (10)

where P and Q are source programs from P, R is a target program from R, and X is a program variable.
Given a target program S ∈ R, define the set of justifications of P → Q : • R → S in (P, R) by

Jus(P,R)(P → Q : • R → S) := JusP(P → Q) ∩ JusR(R → S).

We say that J is a trivial set of justifications in (P, R) iff every justification in J justifies every directed proportion in (P, R). In this case, we call every justification in J a trivial justification in (P, R). Now we call S a solution to (10) in (P, R) iff either JusP(P → Q) ∪ JusR(R → S) consists only of trivial justifications, in which case there is neither a non-trivial transformation of P into Q in P nor of R into S in R; or Jus(P,R)(P → Q : • R → S) is maximal with respect to subset inclusion among the sets Jus(P,R)(P → Q : • R → S′), S′ ∈ R, containing at least one non-trivial justification; that is, for any such program S′ ∈ R,

Jus(P,R)(P → Q : • R → S) ⊆ Jus(P,R)(P → Q : • R → S′) implies Jus(P,R)(P → Q : • R → S′) ⊆ Jus(P,R)(P → Q : • R → S).

In this case, we say that P, Q, R, S are in directed logic program proportion in (P, R), written as

P → Q : • R → S.

We denote the set of all solutions to (10) in (P, R) by Sol(P,R)(P → Q : • R → X). Roughly, a program S in the target domain is a solution to a directed program equation of the form P → Q : • R → X iff there is no other target program S′ whose transformation from R is more similar to the transformation of P into Q in the source domain, expressed in terms of maximal sets of algebraic justifications.
We will always write P instead of (P, P). In what follows, we will usually omit trivial justifications from notation; so, for example, we will write Jus(P,R)(P → Q : • R → S) = ∅ in case the set contains only trivial justifications in (P, R), et cetera. The empty set is always a trivial set of justifications. Every justification is meant to be non-trivial unless stated otherwise.
Certain forms justify any proportion P → Q : • R → S, which yields trivial justifications. This shows that trivial justifications may contain useful information about the underlying structures - in this case, they encode the trivial observation that any two programs P and Q are related in this generic way. We call a form F(Z) a P-generalization of a program P in P iff P = F(O), for some sequence of programs O in P of length |Z|, and we denote the set of all P-generalizations of P in P by GenP(P). Moreover, we define, for any programs P ∈ P and R ∈ R, Gen(P,R)(P, R) := GenP(P) ∩ GenR(R).

Example 13. Consider the directed equation of Example 1 given by

Nat → Plus : • List → X. (11)
This equation asks for a list program S which is obtained from List as the program Plus on numerals is obtained from Nat. In Example 22, we will see that the program for concatenating lists is a solution to (11).
Example 14. Consider the directed equation given by

Nat → Even : • Reverse → X, (12)

where Reverse is the program for reversing lists of Example 10. In Example 23, we will see that the program for reversing lists of even length is a solution to (12).
To guide the AI practitioner, we shall now rewrite the above framework in a more algorithmic style (cf. Antić, 2022, Pseudocode 17).
Pseudocode 15. Computing the solution set S of a directed logic program equation P → Q : • R → X consists of the following steps:
(1) Compute a candidate set S0 := Sol(P,R)(P → Q : • R → X) as follows: (a) for each form F(Z) ∈ GenP(P) and for each form G(Z) ∈ GenP(Q) containing only variables occurring in F(Z) and satisfying Q = G(T) for the witnessing programs T with P = F(T), collect the justification F → G; (b) for each collected justification F → G, compute the candidate targets S := G(T′) arising from decompositions R = F(T′) in the target domain, together with the sets Jus(P,R)(P → Q : • R → S); (c) identify those non-empty sets Jus(P,R)(P → Q : • R → S) which are subset-maximal with respect to S and add those S's to S0.
(2) For each S ∈ S0, check the remaining directed relations of the proportion (obtained by exchanging the roles of the programs) with the above procedure, and add those S ∈ S0 to S which pass all three tests. The set S now contains all solutions to P : Q :: R : X in (P, R).
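The selection in step (c) — keep candidates whose non-empty justification set is subset-maximal — can be sketched as follows; the candidate names and justification labels are toy placeholders.

```python
# Step (c) of the pseudocode, in miniature: among candidate solutions
# with non-empty justification sets, keep those whose set is maximal
# with respect to subset inclusion.

def subset_maximal(candidates):
    """candidates: dict mapping a candidate program to its justification set."""
    nonempty = {s: j for s, j in candidates.items() if j}
    return {s for s, j in nonempty.items()
            if not any(j < j2 for j2 in nonempty.values())}

cands = {"S1": {"Z -> G(Z)"},
         "S2": {"Z -> G(Z)", "Z -> H(Z)"},
         "S3": set()}
print(subset_maximal(cands))  # {'S2'}
```

S1 is dominated by S2's strictly larger justification set, and S3 is discarded because its set is empty.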

Properties of logic program proportions
We summarize here Antić's (2022) most important properties of analogical equations and proportions, interpreted in the logic programming setting from above.

6.1. Characteristic justifications. Computing all justifications of an analogical proportion is complicated in general, which fortunately can be avoided in many cases.
We call a set J of justifications a characteristic set of justifications of a proportion (Antić, 2022, Definition 20) iff it is sufficient for solutionhood, that is, iff every target program whose justification set contains J is a solution. The following lemma is a useful characterization of characteristic justifications in terms of mild injectivity (cf. Antić, 2022, Uniqueness Lemma).
Lemma 16 (Uniqueness Lemma). Any justification F → G of P → Q : • R → X such that F decomposes R in a unique way forms, on its own, a characteristic set of justifications.
Proof. See the proof of Antić's (2022) Uniqueness Lemma.

6.2. Functional proportion theorem. In the rest of the paper, we will often use the following reasoning pattern, which roughly says that functional dependencies are preserved across (different) domains (cf. Antić, 2022, Functional Proportion Theorem):

Theorem 17 (Functional Proportion Theorem). For any (P ∩ R)-form G(Z), we have P → G(P) : • R → G(R), for all P ∈ P and R ∈ R.
In this case, we call G(R) a functional solution of P → G(P) : • R → X.
Proof. See the proof of Antić's (2022) Functional Proportion Theorem.
Functional solutions are plausible since transforming P into G(P) and R into G(R) is a direct implementation of "transforming P and R in the same way"; it is therefore surprising that functional solutions can nonetheless be 'unexpected' and therefore 'creative', as will be demonstrated in Section 7.
Remark 18. An interesting consequence of Theorem 17 is that in case Q ∈ P ∩ R is a constant program contained in both domains P and R, we have

P → Q : • R → Q, (13)

characteristically justified by Theorem 17 via Z → Q. This can be intuitively interpreted as follows: every program in P ∩ R has a 'name' and can therefore be used to form logic program forms, which means that it is in a sense a "known" program. As the framework is designed to compute "novel" or "unknown" programs in the target domain via analogy-making, (13) means that "known" target programs can always be computed.
The following result summarizes some useful consequences of Theorem 17.
Corollary 19. For any source program P ∈ P, target program R ∈ R, and joint programs Q, S ∈ P ∩ R, the proportions P → P : • R → R and P → Q : • R → Q hold in (P, R). The following result is an instance of Antić (2022, Theorem 28).
Corollary 20. For any logic programs P, Q ∈ P and R ∈ R, we have P → P ∪ Q : • R → R ∪ Q whenever Q ∈ P ∩ R.

Examples
In this section, we demonstrate the idea of learning logic programs by analogy via directed logic program proportions by giving some illustrative examples.
Example 21. Let A = {a, b} and B = {c, d} be propositional alphabets, and let P and R for the moment be the identical spaces of all propositional programs over A ∪ B. Consider the following directed equation:

{a ← b} → {a ← b, b} : • {c ← d} → X. (16)

Here we have at least two candidates for the solution S.
First, we can say that the second program in (16) is obtained from the first by adding the fact b, in which case we expect - by analogy - that

S := {c ← d, b} (17)

is a solution to (16). Define the form

G(Z) := Z ∪ {b}. (18)

Then the computation

Z → G(Z) ∈ Jus({a ← b} → {a ← b, b}) ∩ Jus({c ← d} → S) (19)

shows that S is indeed a solution by Theorem 17, that is, we have

{a ← b} → {a ← b, b} : • {c ← d} → {c ← d, b}. (20)

However, what if we separate the two domains by saying that P and R are the spaces of propositional programs over the disjoint alphabets A and B, respectively? In this case, Z → G(Z) is no longer a valid justification of (20), as {b} in (18) is not contained in P ∩ R. This makes sense since, in this case, the "solution" S contains the fact b alien to the target domain R. Thus the question is whether we can redefine G, without using the fact b, so that (19) holds. Observe that b is also the body of a ← b, which motivates the following definition:

G(Z) := Z ∪ body(Z),

which means that we can compute a solution S′ of (16) via Theorem 17 as

S′ = G({c ← d}) = {c ← d, d}.

Example 22. Reconsider the situation in Example 9, where we have derived the abstract form Plus generalizing addition. As a consequence of Theorem 17, we have the following directed logic program proportion:

Nat → Plus(Nat) : • List → Plus(List). (21)

(For simplicity, we omit here the variables u and x from notation, that is, we write Nat and List instead of Nat(x) and List(u, x), respectively.) This proportion formalizes the intuition that "numbers are to addition what lists are to list concatenation." Similarly, we have Nat → Plus(Nat) : • Tree → Plus(Tree).
Without going into technical details, we want to mention that a similar procedure as in Example 9 applied to a program for multiplication yields a form Times(Zq(x)) such that Times(List(u, x)) is a program for "multiplying" lists, e.g., one where the result [b, b, b, b] is obtained from the input lists by concatenating the second list [b, b] k times with itself, where k is the length of the first list (in this case k = 2; the actual content of the first list does not matter here). We then have the following directed logic program proportion as an instance of Theorem 17:

Plus(Nat) → Times(Nat) : • Plus(List) → Times(List).

In other words, addition is to multiplication what list concatenation is to list "multiplication."

Example 23. In Example 3, we have constructed Even from Nat via composition, and in Example 10, we have then derived the abstract form Even generalizing "evenness." As a consequence of Theorem 17, we have the following directed logic program proportion:

Nat → Even(Nat) : • Reverse → Even(Reverse).

This shows that the (seemingly unrelated) program for reversing lists of even length shares the syntactic property of "evenness" with the program for constructing the even numbers.
Example 24. In Example 11, we derived the abstract form Member generalizing "membership" and asked the following question: what does "membership" mean in the numerical domain?
We can now state this question formally as the following directed logic program equation:

As a consequence of Theorem 17, we have the following directed logic program proportion:

where Member(Nat(u)) is the program computing the numerical "less than" relation of Example 11.
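The transfer of "membership" to the numerical domain can be made concrete in the same operational style as before: membership recurses on cons, and the analogous form over successors is exactly "less than". A hedged sketch (encodings and names are ours):

```python
# Naturals as 0 and ('s', n); lists as () and ('cons', h, t).

def member(x, ys):
    """Membership by recursion on cons: x ∈ cons(h, t) iff x = h or x ∈ t."""
    if ys == ():
        return False
    _, h, t = ys
    return x == h or member(x, t)

def less_than(x, n):
    """The analogous form over successors: x < s(n') iff x = n' or x < n'."""
    if n == 0:
        return False
    _, n_pred = n
    return x == n_pred or less_than(x, n_pred)

one   = ('s', 0)
three = ('s', ('s', ('s', 0)))
ab    = ('cons', 'a', ('cons', 'b', ()))
```

With these encodings, member('b', ab) holds, and less_than(one, three) holds while less_than(three, one) fails: "x occurs in the list" and "x occurs strictly below n" are instances of one recursive scheme.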

Related Work
Arguably, the most prominent (symbolic) model of analogical reasoning to date is Gentner's (1983) Structure-Mapping Theory (SMT), first implemented by Falkenhainer, Forbus, and Gentner (1989). Our approach shares with Gentner's SMT its symbolic nature. However, while in SMT mappings are constructed with respect to meta-logical considerations (for instance, Gentner's systematicity principle prefers connected knowledge over independent facts), in our framework "mappings" are realized via directed logic program proportions satisfying mathematically well-defined properties.
The functional-based view in Barbot, Miclet, and Prade (2019) is related to our Theorem 17 on the preservation of functional dependencies across different domains (Section 6.2). Moreover, Antić (2022, §7.3) contains a brief discussion of the important difference between analogical proportions and categories as studied in abstract algebra.
Heuristic-Driven Theory Projection (HDTP) (Schmidt, Krumnack, Gust, & Kühnberger, 2014) has a similar focus on analogical proportions between logical theories. The critical difference to our approach is that we consider the set of all generalizations of a program, whereas HDTP only considers minimally general generalizations (mggs); that is, there is no notion of "justification" in HDTP, and proportions are "justified" by mggs only. Another difference is that HDTP is formulated within first-order logic, whereas our framework is formulated within logic programming. The main benefit of restricting the formalism to logic programs (i.e., sets of Horn clauses) is that their rule-like syntactic form allows an algebraization via composition and concatenation, which is not the case for first-order logic. Moreover, the task of finding generalizations is governed by heuristics in HDTP, which has no counterpart in our theory. In a sense, similar to HDTP, our framework can be interpreted as a generalization of classical anti-unification (Plotkin, 1970; Reynolds, 1970). More precisely, while anti-unification focuses on least general generalizations of terms, we are interested here in all generalizations of programs (i.e., logical theories).
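For readers unfamiliar with classical anti-unification, the least general generalization (lgg) of two terms can be computed by a short recursion in the Plotkin/Reynolds style. The following is a simplified sketch over first-order terms encoded as nested tuples, with our own variable-naming convention; it is meant only to illustrate the operation that HDTP and our framework both generalize.

```python
def lgg(s, t, table=None):
    """Least general generalization of two terms.
    Terms: ('f', arg1, ...) for compound terms, plain atoms otherwise.
    Each mismatching subterm pair is mapped to a shared fresh
    variable '?0', '?1', ... (the same pair always gets the same variable)."""
    if table is None:
        table = {}
    # Same functor and arity: descend into the arguments.
    if (isinstance(s, tuple) and isinstance(t, tuple)
            and s[0] == t[0] and len(s) == len(t)):
        return (s[0],) + tuple(lgg(a, b, table) for a, b in zip(s[1:], t[1:]))
    if s == t:
        return s
    # Mismatch: reuse the variable already assigned to this pair, if any.
    if (s, t) not in table:
        table[(s, t)] = '?%d' % len(table)
    return table[(s, t)]
```

For instance, lgg(('f', 'a', 'a'), ('f', 'b', 'b')) yields ('f', '?0', '?0'): the repeated mismatch a/b is generalized by a single shared variable, which is precisely what makes the result the *least* general generalization rather than the cruder ('f', '?0', '?1').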
Finally, we want to mention the recent work in Antić (2023b), where a syntactic and algebraic notion of logic program similarity has been introduced via sequential compositions and decompositions as defined in Section 3.1.

Future Work
In this paper, we have demonstrated the utility of our framework of directed logic program proportions for learning logic programs by analogy with numerous examples.
The main task for future research is to develop methods for the algorithmic computation of solutions to directed program equations as defined in this paper (see Pseudocode 15). At its core, this requires algebraic methods for logic program decomposition (Antić, 2023c, 2023d) and deconcatenation, which are then used to compute forms generalizing a given program and (characteristic) justifications of a directed proportion. This task turns out to be highly non-trivial even in the propositional case. In fact, the only domain I fully understand at the moment is the 2-valued boolean domain, consisting of only two elements 0 and 1, and already in that simple case a whole paper is needed to fully describe all solutions (Antić, 2023a). For example, in the arithmetical domain of natural numbers with multiplication, computing all solutions even to a single concrete analogical equation is non-trivial: computing all solutions to 20 : 4 :: 30 : x requires an 8-page-long computation (cf. Antić, 2022, p. 42, Example 66). Since logic programs are more complicated than booleans or numbers, providing general algorithms for the computation of some or all solutions to analogical logic program equations is highly non-trivial even in the propositional case and far beyond the scope of the current paper.
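To give a flavor of what "computing solutions" means in the arithmetical case, here is a toy brute-force solver over a deliberately tiny family of justifications (multiplication and exact division by a constant). This is far simpler than the framework of Antić (2022) and is not claimed to enumerate its full solution set; all names are ours.

```python
def solve(a, b, c, limit=100):
    """Toy solver for a : b :: c : x over the naturals.
    A 'justification' here is a map z -> z*k or z -> z/k (exact division);
    x counts as a solution if some such map sends both a to b and c to x.
    This only scratches the surface of the actual solution space."""
    solutions = set()
    for k in range(1, limit):
        if a * k == b:                           # z -> z*k sends a to b
            solutions.add(c * k)
        if a % k == 0 and a // k == b:           # z -> z/k sends a to b
            if c % k == 0:                       # ...and is defined on c
                solutions.add(c // k)
    return solutions

print(solve(20, 4, 30))  # → {6}: the map z -> z/5 sends 20 to 4 and 30 to 6
```

Even this crude search shows the flavor of the problem: solutions are justified by shared transformations, and enlarging the family of admissible transformations enlarges the solution set, which is why a complete characterization requires lengthy case analysis.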
This does not mean that the framework is useless for learning. On the contrary, the paper shows, I hope, quite convincingly that learning logic programs via solving analogical equations (which appears to be a novel idea) can in principle be done via solving (directed) logic program equations as proposed in the paper. It is therefore, in a sense, a "declarative" paper which shows what can be done with logic program proportions; in the future, more "procedural" papers will be needed to resolve the issue of how solutions to equations are to be computed in practice.
Composition and concatenation are interesting operations on programs in their own right, and a comparison to other operators for program modularity (cf. Bugliesi, Lamma, & Mello, 1994; Brogi, Mancarella, Pedreschi, & Turini, 1999) remains future work. A related question is whether these operations are sufficient for modeling all plausible analogies in logic programming or whether further operations are needed ("completeness"). It is important to emphasize that, in the latter case, adding novel operations to the framework does not affect the general formulations of the core definitions.
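To fix intuitions about what a syntactic composition operator does, here is one simplified candidate for the propositional case, namely one-step unfolding. This is a sketch under our own encoding and is not necessarily the paper's exact definition from Section 3.1; it only illustrates the kind of algebraic operation meant.

```python
from itertools import product

def compose(P, R):
    """One-step unfolding of P through R, propositional case (a sketch,
    NOT the paper's official definition of sequential composition).
    A program is a set of rules (head, frozenset_of_body_atoms);
    a fact is a rule with an empty body. Each body atom of a P-rule is
    replaced by the body of some R-rule deriving it; rules with a body
    atom that R cannot derive are dropped."""
    result = set()
    for head, body in P:
        # For each body atom, collect the bodies of R-rules deriving it.
        choices = [[rb for rh, rb in R if rh == atom] for atom in body]
        if not all(choices):
            continue  # some body atom is underivable via R
        for combo in product(*choices):
            new_body = frozenset().union(*combo) if combo else frozenset()
            result.add((head, new_body))
    return result

# Example: P = {a <- b} and R = {b <- c} compose to {a <- c}.
P = {('a', frozenset({'b'}))}
R = {('b', frozenset({'c'}))}
```

Under this encoding, compose(P, R) yields {('a', frozenset({'c'}))}, i.e., the rule a ← c, and facts of P pass through unchanged; comparing such operators with the modularity operators of Bugliesi et al. and Brogi et al. is exactly the future work alluded to above.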
In this paper, we have restricted ourselves to Horn programs. In the future, we plan to adapt our framework to extended classes of programs such as higher-order logic programs (cf. Chen, Kifer, & Warren, 1993; Miller & Nadathur, 2012) and non-monotonic logic programs under the stable model or answer set semantics (Gelfond & Lifschitz, 1991) and extensions thereof (cf. Brewka, Eiter, & Truszczynski, 2011). For this, we will define the composition and concatenation of answer set programs (Antić, 2023c), which is non-trivial due to negation as failure occurring in rule bodies (and heads).
Finally, a formal comparison of analogical reasoning and learning as defined in this paper with other forms of reasoning and learning, most importantly inductive logic programming (Muggleton, 1991), is desirable, as this line of research may lead to an interesting combination of different learning methods.

Conclusion
This paper studied directed analogical proportions between logic programs for logic-based analogical reasoning and learning in the setting of logic programming. This enabled us to compare logic programs, possibly across different domains, in a uniform way, which is crucial for AI systems. For this, we defined the composition and concatenation of logic programs and showed, by giving some examples, that syntactically similar programs have similar decompositions. This observation led us to the notion of logic program forms, which are proper generalizations of logic programs. We then used forms to formalize directed analogical proportions between logic programs, as an instance of the author's model of analogical proportions, as a mechanism for deriving novel programs in an "unknown" target domain via analogical transfer, realized by generalization and instantiation, from a "known" source domain.