Recurrence-Driven Summations in Automated Deduction

Abstract. Many problems in mathematics and computer science involve summations. We present a procedure that automatically proves equations involving finite summations, inspired by the theory of holonomic sequences. The procedure is designed to be interleaved with the activities of a higher-order automatic theorem prover. It performs an induction and automatically solves the induction step, leaving the base cases to the theorem prover.


Introduction
Finite summations, that is, summations $\sum_{i=m}^{n} t_i$ over finitely many terms $t_i$, are ubiquitous in mathematics and computer science, but they are poorly supported by automatic theorem provers. One reason is that summations are higher-order, whereas most theorem provers are first-order.
In recent years, we have seen the rise of higher-order provers [2, 3, 16-18]. With these provers, $\sum_{i=m}^{n} t_i$ can be represented as $\mathsf{sum}\ m\ n\ (\lambda i.\ t_i)$; the traditional syntax can be seen as syntactic sugar. But despite the use of heuristics [17, Sect. 4], higher-order provers are ill-equipped to reason inductively. A simple problem such as $\sum_{i=0}^{n} i = n(n+1)/2$ is a formidable challenge for them, even if we include axioms for $+$, $\cdot$, $/$, and $\sum$ together with an induction principle. In this paper, we introduce a procedure for proving such equations in a higher-order prover. The procedure is triggered by a proof goal of the form $k\sum s + t = u$, possibly with some conditions (Sect. 2). In a refutational prover, the equation would be negated, as $k\sum s + t \neq u$, and would correspond to the negated conjecture, a problem axiom, or some clause derived by the prover.
Our procedure translates facts about summations to linear recurrences. These recurrences have almost the same form as those of multivariate holonomic sequences [20], which, while not a prerequisite for reading this paper, strongly inspired our work. Each recurrence is associated with a multivariate sequence, that is, a sequence with one or more indices. In this paper, the word "sequence" generally means "multivariate sequence." The procedure has three steps.
1. Initialization (Sect. 3): Heuristically choose terms in the goal to generalize and perform induction on. Among the problem axioms, select those of a suitable form as initial recurrences for the procedure.
2. Propagation (Sect. 4): From the initial recurrences, compute recurrences corresponding to the goal. For $+$, $\cdot$, and $\sum$ expressions occurring in the goal, recurrences are computed from the recurrences of their direct subexpressions.
3. Induction (Sect. 5): If the final recurrences for the goal involve only the goal and no other sequences, use them for induction. If they make the difference of successive values of $k\sum s + t - u$ constantly 0, this establishes the induction step. Then reduce the goal to a set of base cases and give these to the prover.
Propagation and induction apply holonomic-style techniques almost as a black box. Initialization connects them to the overall proof search. For example, to prove $\sum_{i=0}^{n} i = n(n+1)/2$, the procedure would transform the equation into recurrences and find out that the difference $\sum_{i=0}^{n} i - n(n+1)/2$ remains constant as $n$ increases, thereby establishing the induction step. If that difference is constantly 0, we get $\sum_{i=0}^{n} i = n(n+1)/2$; in general, it suffices to prove a number of base cases, which are left to the prover. This example is very simple, but the procedure scales up to more sophisticated problems (Sect. 6). An implementation is under way in the Zipperposition prover [17].
The procedure treats $\sum$ as an interpreted (built-in) symbol. The summation expression evaluates to a value in a commutative group, or a ring if ring multiplication is present. The commutative group or ring gives us $+$, $\cdot$, and $-$. These are also interpreted, as are numerals. Integers, including indices, can multiply group elements. Based on this interpretation, we use the forms $t = u$ and $t - u = 0$ interchangeably.
Compared with Wilf-Zeilberger pairs [19] and other methods (Sect. 7), the main benefit of our procedure is that it goes beyond holonomic sequences and supports both uninterpreted functions and an infinite number of base cases. Our procedure is widely applicable and may help prove not only difficult summations in a restrictive form but also easier summations in a more general form, which is useful in a general-purpose theorem prover. At the heart of our work is the novel combination of techniques from superposition and holonomic sequences, which is visible both in the prover integration (Sect. 2) and in the computation of so-called excess terms (Sect. 4). We refer to our technical report [14] for more details.
These side conditions apply:
• $t[\vec s\,]$ is an expression that can be brought into the general form $k\sum_{i=m}^{n} t' + t''$;
• the procedure selects, generalizes, and performs an induction on the subterms $\vec s$ of $t$ (Sect. 3);
• the procedure succeeds at proving the induction step based on initial recurrences derived from $C_1, \ldots, C_l$ (Sect. 3) and their propagation (Sect. 4);
• the procedure identifies $B$ as the finite set of base cases of the induction, where each case is a vector $\vec b$ of terms of the same length as $\vec s$ (Sect. 5); and
• the subclause $D$ captures potential conditions determined by the procedure.
The intuition behind the rule is that the conclusion should be easier to refute than the rightmost premise. As for the premises $C_1, \ldots, C_l$, they can contain useful information about $\vec s$, often about bounds.

Initialization
The first step of our procedure is to recognize the structure of recurrences. Variables on which we can perform induction appear as Skolem constants in the negated goal. Further opportunities for induction can be created by generalizing complex terms. Also as part of this step, we must choose which terms represent (multivariate) sequences and which clauses represent their recurrences.
Theory Detection. We require the necessary theory of summation to be predefined. Specifically, this refers to the inductive theory of integers, axioms for commutative groups (including multiplication by integers), and the recursive definition of summation from 0, given by $\sum_{i=0}^{n+1} t_i = \sum_{i=0}^{n} t_i + t_{n+1}$ with $\sum_{i=0}^{n} t_i = 0$ for $n < 0$. Ring multiplication may be absent, so we do not take it as predefined. Instead, we search for candidate binary operators in the negated goal. For each candidate, we can try to prove left and right distributivity by syntactically looking for that axiom or by running another instance of the prover. Distributivity is the only property necessary to apply the procedure, but associativity, commutativity, and the unit element can also be used in simplifications.
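The syntactic half of this distributivity check can be sketched over a toy tuple representation of terms; all names and the representation below are our own, not the paper's:

```python
# Sketch: syntactically recognize a left-distributivity axiom
# x * (y + z) = x*y + x*z for candidate operators `mul` over `add`.
# A term is a variable (str) or a tuple (operator, left, right).

def is_left_distributive(axiom, mul, add):
    lhs, rhs = axiom
    if not (isinstance(lhs, tuple) and lhs[0] == mul
            and isinstance(lhs[2], tuple) and lhs[2][0] == add):
        return False
    x, (_, y, z) = lhs[1], lhs[2]
    # expected right-hand side: (x*y) + (x*z)
    return rhs == (add, (mul, x, y), (mul, x, z))

axiom = (("mul", "x", ("add", "y", "z")),
         ("add", ("mul", "x", "y"), ("mul", "x", "z")))
print(is_left_distributive(axiom, "mul", "add"))  # True
```

Real axioms would of course be matched up to variable renaming; the fallback described in the text is to hand the candidate property to another prover instance.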
Term Generalization. Term generalization transforms Skolem constants or complex terms into variables and then performs an induction on the variables. We propose a straightforward heuristic: for each nonnumeral subterm $s$ of type $\mathbb{Z}$ occurring in the negated goal, generalize $s$ if $s$ stays variable-free even after recursively applying this heuristic to the proper subterms of $s$ itself. For example, in the following variable-free integer terms, the underlined subterms would be generalized: $a$, $123$, $f\,0\,2$, $2f\,(g\,(-1))\,(-3a)$, $f\,1\,(g\,(a + 1))$, $f\,(g\,a)\,7a$.
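The heuristic can be sketched as a bottom-up pass over a simple term representation (the representation and variable-naming scheme are hypothetical, not the paper's):

```python
# Sketch of the generalization heuristic: replace by a fresh variable
# exactly those nonnumeral integer subterms that stay variable-free
# after their proper subterms have been processed.
from itertools import count

def generalize(t, fresh):
    """Return (term, contains_variable). Terms are ints (numerals),
    strings (constants), or tuples (function, arg1, ...)."""
    if isinstance(t, int):                 # numerals are never generalized
        return t, False
    if isinstance(t, tuple):               # application: process arguments
        args, has_var = [], False
        for arg in t[1:]:
            arg2, v = generalize(arg, fresh)
            args.append(arg2)
            has_var = has_var or v
        if has_var:                        # no longer variable-free: keep it
            return (t[0],) + tuple(args), True
    # a variable-free constant or application: generalize the whole subterm
    return "n%d" % next(fresh), True

fresh = count()
print(generalize(("f", 0, 2), fresh))      # ('n0', True)
term = ("mul", 2, ("f", ("g", -1), ("mul", -3, "a")))
print(generalize(term, fresh)[0])          # ('mul', 2, ('f', 'n1', ('mul', -3, 'n2')))
```

So $f\,0\,2$ is generalized as a whole (its proper subterms are numerals), whereas inside $2f\,(g\,(-1))\,(-3a)$ only $g\,(-1)$ and $a$ become variables.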
Let $\vec s = (s_1, \ldots, s_d)$ be the subterms chosen for generalization. Then, based on the negated goal $C' \lor t[\vec s\,] \neq 0$ (as in the Summation rule), generalization sets up the goal $\forall \vec n \in N.\ t[\vec n\,] = 0$, where $N \subseteq \mathbb{Z}^d$ collects the bounds of $\vec s$ (often $N = \mathbb{N}^d$). We try to prove this goal up to base cases and other mild conditions.
The generalization makes it possible to use induction to prove that the goal sequence term $t[\vec n\,]$, a function of $\vec n$, equals zero on $N$. We try to prove the generalized goal assuming $\neg C'$ and some extra conditions $E$ such as the base cases of the induction. Then, instantiating $\vec n := \vec s$, we conclude $C' \lor \neg E \lor t[\vec s\,] = 0$. This, together with the negated goal $C' \lor t[\vec s\,] \neq 0$, implies a conclusion of the form $C' \lor \neg E$ for the Summation rule. Note that $C'$ is not generalized.
The set $N$ embodies knowledge about $\vec s$ that we find among existing clauses $C_1, \ldots, C_l$ and the condition $\neg C'$. The free variables of $\neg C'$ are interpreted as constants, and they can also occur in $\vec s$. For example, assume that $\vec s = f\,s'$ and $\vec n = n$ and that the generalized goal contains the factorial $n!$. Its recurrence must be in a conditional clause, e.g., $(m+1)! = (m+1)\,m! \lor 0 \not\leq m$. To use this recurrence for $n!$, we need $n \geq 0$, which we can ensure using $N$ if we find a bounding clause $f\,s' \geq 0$ or a generalization of it such as $f\,m \geq 0$ where $m$ is a free variable. The more we know about $\vec s$, the more recurrences we can get. At the same time, $N$ must allow induction, so we keep it convex by considering only coordinatewise bounds of $\vec s$.
Form of Sequence Terms. Sequence terms are terms of the underlying higher-order logic that our procedure can work with. From their structure, we distinguish (pointwise) addition and multiplication, summation, and affine substitution. This gives a first-order grammar to express the sequence terms.
Definition 1. Sequence terms on a ring $A$ are inductively defined as follows. The logic's terms of type $A$ with distinguished integer variables $\vec n$ are sequence terms. If $f_{\vec n}$ and $g_{\vec n}$ are sequence terms with $d$ variables $\vec n$, then so are $f_{\vec n} + g_{\vec n}$, $f_{\vec n} \cdot g_{\vec n}$, $\sum_{i=0}^{\vec c\,\cdot\,\vec n + a} f_{\{n_j \mapsto i\}\vec n}$, and $\sigma f_{\vec n} = f_{\sigma \vec n}$, where $\vec c$ is a vector, $a$ is an integer, and $\sigma$ is an affine substitution (meaning $\sigma \vec m = q\vec m + \vec b$ for a matrix $q$ and a vector $\vec b$); $a$, the entries of $\vec c$, and the entries of $\sigma$ (meaning the entries of $q$ and $\vec b$) must be numerals.
Remark 2. In Definition 1 and in the sequel, a commutative group can be used instead of a ring if ring multiplication is absent. In this case, all formulas involving ring multiplication (e.g., $f_{\vec n} \cdot g_{\vec n}$) should be ignored.
We view sequence terms as functions $\mathbb{Z}^d \to A$. We then write the sequence terms from the definition compactly as $f + g$, $f \cdot g$, $\sum_j^a f$, and $\sigma f$, and call $a = \vec c \cdot \vec n + d$ an affine variable sum. Moreover, since $\cdot$, $\sum_j^a$, and $\sigma$ all distribute over $+$, we can write any sequence term as $c_1 f_1 + \cdots + c_k f_k$, where the coefficients $c_j$ are numerals and the sequence terms $f_j$ are distinct and do not contain $+$. Finally, we forbid variable shadowing: $\sum_j^a$ binds $n_j$, and while $\sum_j^a \sum_j^b g$ and $\sum_j^{n_j} g$ and other references to $n_j$ outside $\sum_j^a$ are syntactically valid, we avoid such forms by renaming during encoding and never reintroducing them.
Choice of Initial Recurrences. Semantically, the recurrences we look for are multivariate heterogeneous linear finite-fixed-step equations with polynomial coefficients. An archetypical example is recurrence (1). There, the sequences $f$, $h$, $1$ are bivariate, and the sequence indices are all of the form $n + k$ or $m + k$ for numerals $k \in \mathbb{Z}$, amounting to finite fixed steps.
The general form is $0 = P_1 g_1 + \cdots + P_k g_k = \vec P \cdot \vec g$, where $\vec g = (g_1, \ldots, g_k)$ is a tuple of sequence terms and $\vec P$ is a tuple of operator polynomials as defined below. If $k = 1$, we have a homogeneous recurrence of $g_1$; otherwise, the recurrence is heterogeneous.

Definition 3. Operator polynomials are a $\mathbb{Z}$-algebra with composition as product (meaning closed under addition, composition, and integer multiplication) spanned by the multiplier and shift operators:
• The multiplier operator $M_j$ of index $j$ multiplies a multivariate sequence $f$ by the variable $n_j$ of index $j$: $(M_j f)\,\vec n = n_j \cdot f\,\vec n$.
• The shift operator $S_j$ of index $j$ increments the variable $n_j$: $(S_j f)\,\vec n = f\,(\vec n + \vec e_j)$, where $\vec e_j$ is the $j$th unit vector.
With $d$ index variables, the operator polynomials look like ordinary polynomials $\mathbb{Z}[M_1, \ldots, M_d, S_1, \ldots, S_d]$, but the composition product is noncommutative since $S_i M_i = M_i S_i + S_i$ for all $i = 1, \ldots, d$ (a derivation of which is given in the next section, directly above equation (2)). As an example of expressing recurrences in terms of operator polynomials, consider the previous archetypical recurrence (1). Taking $n$ as the first and $m$ as the second variable, the recurrence can be rewritten in terms of the multiplier and shift operators.

Remark 4. The expression $\vec P \cdot \vec g$ identifying a recurrence is itself a sequence term. It suffices to observe that if $f$ is a sequence term, then so are the substitution $S_j f$ and the product $M_j f = (\vec n \mapsto n_j) \cdot f$ with the projection sequence term $\vec n \mapsto n_j$.
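The noncommutation relation $S_i M_i = M_i S_i + S_i$ can be checked directly by treating multipliers and shifts as higher-order functions on sequences; a small numeric sketch (the sample sequence is arbitrary):

```python
# Numeric check of the relation S1*M1 = M1*S1 + S1 as operators
# on a sample bivariate sequence.

def f(n, m):                 # arbitrary sample sequence
    return n * n + 3 * m

M1 = lambda g: (lambda n, m: n * g(n, m))   # multiplier by n
S1 = lambda g: (lambda n, m: g(n + 1, m))   # shift of n

lhs = S1(M1(f))              # (S1 M1 f)(n, m) = (n+1) * f(n+1, m)
rhs = lambda n, m: M1(S1(f))(n, m) + S1(f)(n, m)

print(all(lhs(n, m) == rhs(n, m) for n in range(5) for m in range(5)))  # True
```

Indeed, $(S_1 M_1 f)(n, m) = (n+1)f(n+1, m)$, which is exactly $n\,f(n+1,m) + f(n+1,m)$.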
As sketched in Sect. 1, we must select some of the problem axioms as initial recurrences for the procedure. This is accomplished as follows. Let there be an edge between two axioms of the form $C \lor s = t$ (where $C$ may be empty) if they both contain a top-level occurrence of the same sequence $g$, i.e., an occurrence of $g$ that is not nested inside an uninterpreted function symbol. The axioms then form a graph. We take as initial recurrences the connected component of the generalized goal.
By a sequence $g$, we mean the $f\,\vec a$ part of a term of the form $f\,\vec a\,\vec n$, where $f$ is an uninterpreted function symbol, $\vec a$ is a tuple of variable-free terms, and $\vec n$ is a nonempty tuple of integer variables or affine (i.e., linear term + constant term) combinations of them. The tuples $\vec a$ and $\vec n$ may in general be interleaved.
In other contexts, an analogous step is known as lemma filtering or premise selection [4, Sect. 2]. Clutter from irrelevant facts is less of an issue in the context of our procedure because it can use only linear recurrences. Beyond this, our simple heuristic does nothing to avoid clutter.
What should we do about conditions such as $C$ in $C \lor f\,\vec a\,\vec n = t$? We could forbid them and work only with unit equations such as $f\,\vec a\,\vec n = t$. We could collect them and put them in the $D$ component of the Summation rule's conclusion. Or we could attempt to prove them when the initial recurrences are selected. In our ongoing implementation, we chose the first option, but the best option remains an open question.

Propagation
Holonomic sequences can be defined by homogeneous recurrences with polynomial coefficients and finitely many base cases. They are closed under the four operations that build sequence terms ($+$, $\cdot$, $\sum_j^a$, $\sigma$), which in particular makes their equality decidable [20]. The closure is realized by four procedures that derive recurrences of a sequence term from the recurrences of its immediate subterms, which we call propagation. We can propagate independently of the base cases and hence work on nonholonomic sequence terms [6]. Although we expect the holonomic subcase to be decidable in our setting, in general decidable equality is lost. Additionally, unlike in the holonomic setting, we allow heterogeneous recurrences. We build this into the noncommutative Gröbner basis setup used in the propagation procedures.
Gröbner Bases of Recurrence Operators. A (generalized) Gröbner basis is a certain well-behaved generating set of a left ideal of (possibly noncommutative) polynomials. Equivalently, we view it as a system of polynomial equations that is complete for rewriting. Given a polynomial equation $P = 0$, for every monomial $M$ we get a rewrite rule as follows. Decompose $MP$ as $MP = L + R$, where $L$ is the leading monomial of $MP$ (w.r.t. a fixed monomial ordering) times its coefficient. Then $L = -R$ gives rise to a rewrite rule $L \to -R$. A system of equations is complete for rewriting if every one of its consequences can be proved via rewriting by these rules.
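In a commutative stand-in (the paper's operator polynomials are noncommutative), this rule extraction can be illustrated with SymPy; the polynomial and multiplier below are arbitrary choices:

```python
# Turning an equation P = 0 into a rewrite rule L -> -R:
# multiply by a monomial M and split off the leading term of M*P.
from sympy import symbols, expand, LT

a, b = symbols("a b")
P = a*b**2 - a - b           # the equation a*b^2 = a + b, written as P = 0
M = a                        # an arbitrary monomial multiplier
MP = expand(M * P)
L = LT(MP, order="grevlex")  # leading term w.r.t. a fixed monomial order
R = MP - L
print(L, "->", -R)           # a**2*b**2 -> a**2 + a*b
```

Thus every monomial multiple of an equation yields one rewrite rule; completeness for rewriting is exactly the Gröbner basis property.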
Example 5. The system $\{ab^2 = a + b,\ a^2b = a + 1\}$ does not prove its consequence $a^2 = b$ by rewriting. (We can see that $a^2 = b$ is a consequence by multiplying the first equation by $a$ and the second equation by $b$ and then subtracting the two equations.) In the other direction, the system's Gröbner basis does prove $a^2 = b$ by rewriting.

A theory of Gröbner bases exists for various polynomial algebras [10]. In our setting, a sufficient requirement is that all indeterminates $X$, $Y$ commute up to lower-order terms: $XY - YX \in \mathbb{Z}X + \mathbb{Z}Y + \mathbb{Z}$. The operator polynomials of Definition 3 fall into this category with the natural choice of taking all multiplier and shift operators as indeterminates. Indeed, for any sequence term $f$, we have the noncommutation relations
$$S_i M_j = M_j S_i + \delta_{i,j} S_i \quad \text{for all } i \text{ and } j, \qquad (2)$$
where $\delta_{i,j}$ equals 1 if $i = j$ and 0 otherwise; all other pairs of multipliers and shifts commute exactly. When we consider a formal polynomial algebra (necessary to perform Gröbner basis computations), we will usually mean polynomials with integer coefficients and indeterminates $M_1, M_2, \ldots, S_1, S_2, \ldots$ satisfying (2). Exceptionally, when we propagate to substitution, we will consider compositions of shifts formally as further individual indeterminates, as explained above Procedure 12. Apart from this exceptional setting, we fix a choice of monomials as follows.
Definition 6. In our setting, a monomial is a polynomial of the form $M_1^{x_1} \cdots M_d^{x_d} S_1^{y_1} \cdots S_d^{y_d}$, where the exponents $x_j, y_j \in \mathbb{N}$ are numerals. Due to the (non)commutation relations (2), polynomials can be written as sums of monomials times their integer coefficients. This makes working with these noncommutative polynomials similar to working with commutative ones. A major difference is that monomials are not closed under product, as illustrated by $S_1 \cdot M_1 = M_1 S_1 + S_1$. This complicates the definition of monomial order below, which in turn defines how to interpret a polynomial equation as a rewrite rule.

Definition 7. A monomial order $\preceq$ is a well-founded total order on monomials such that for all monomials $A$, $B$, $C$, if $A \preceq B$, then the leading monomial of $CA$ is $\preceq$-smaller than the leading monomial of $CB$; here, the leading monomial of a nonzero polynomial $P$ means the $\preceq$-largest monomial occurring in $P$.
Buchberger's algorithm for computing Gröbner bases (also in a noncommutative context) is similar to saturation-based theorem proving. It repeatedly derives from polynomial equations $P = 0$ and $R = 0$ new equations $AP - BR = 0$, where the coefficient-monomial products $A$, $B$ make the leading monomials of $AP$ and $BR$ cancel. It suffices to take $A$, $B$ of smallest total degree and with coprime coefficients. $A$ and $B$ play a similar role to the most general unifier in superposition. Since $S_j$ is semantically bijective, we can and always do cancel it, replacing $S_j R = 0$ by $R = 0$. This modified completion into a Gröbner basis always terminates. The standard termination proof reduces to applying noetherianity of commutative polynomials over $\mathbb{Z}$ or Dickson's lemma [10].
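A combination of the form $AP - BR$ can be seen concretely on the system of Example 5, where $A = a$ and $B = b$ already yield the consequence $a^2 = b$; a commutative SymPy check:

```python
# Verifying the aside of Example 5: a*(ab^2 - a - b) - b*(a^2*b - a - 1)
# collapses to b - a^2, so a^2 = b is a consequence of the system.
from sympy import symbols, expand

a, b = symbols("a b")
e1 = a*b**2 - a - b        # the equation a*b^2 = a + b, as e1 = 0
e2 = a**2*b - a - 1        # the equation a^2*b = a + 1, as e2 = 0
combo = expand(a*e1 - b*e2)
print(combo)               # -a**2 + b
```

Here the degree-4 leading terms $a^2b^2$ of both products cancel, exactly the cancellation Buchberger's algorithm engineers.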
A single operator polynomial $P_1$ perfectly encodes a linear homogeneous recurrence $0 = P_1 g$ of a sequence term $g$. However, we allow any heterogeneous recurrence of the form $0 = \vec P \cdot \vec f$, where $\vec f$ is an arbitrary tuple of distinct sequence terms. We can encode this by a single operator polynomial for the duration of one Gröbner basis computation as follows. Let $\vec f$ enumerate exactly once all the sequence terms needed to express the current recurrences with the help of operator polynomials. Let $\vec f$ depend on $d$ variables. For each $f_j$, we consider a shift $F_j := S_{d+j}$ w.r.t. a so far unused variable. Then the operator polynomial $\vec P \cdot \vec F$ encodes $0 = \vec P \cdot \vec f$. This encoding does not respect the semantics of operator polynomials; to recover it, we must apply the substitution $\{\vec F \mapsto \vec f\}$. However, products such as $F_1 F_2$ remain uninterpretable even with ring-valued sequences because the operator product, function composition, is different from multiplication of the $f_j$'s. Hence, we simply discard uninterpretable polynomials after the Gröbner basis computation. Moreover, from now on, we freely write $f_j$ for $F_j$.

Definition 8. Let $X_1, \ldots, X_n$ be an enumeration of all multiplier and shift indeterminates. An $(X_1, \ldots, X_k)$-elimination order is a monomial order such that $X_j \succ X_{k+1}^{a_{k+1}} \cdots X_n^{a_n}$ for all indices $j \leq k$ and all exponents $a_{k+1}, \ldots, a_n \in \mathbb{N}$.
Our default choice for the order is to compare total degrees in $X_1, \ldots, X_k$ and break ties using the total degree reverse lexicographic order [7, Chapter 2, §2].

Procedure 9. Eliminating indeterminates $X_1, \ldots, X_k$ from a finite system of equations $E$ means computing a Gröbner basis $G$ of $E$ w.r.t. an $(X_1, \ldots, X_k)$-elimination order and then discarding all polynomials from $G$ that contain any of $X_1, \ldots, X_k$ or that are not linear in the indeterminates encoding sequence terms. (As mentioned above, during the Gröbner basis computation, whenever we derive a polynomial $S_j R$, we replace it by $R$.) While in principle any Gröbner basis would suffice for elimination, our default choice is to compute the reduced Gröbner basis (i.e., the fully simplified one). The nonlinear polynomials can be discarded as soon as they are derived during the Gröbner basis computation instead of only at the end. Recurrence equations produced by elimination are logical consequences of the input equations, as we explain in our technical report.
Despite the formally equivalent roles of all sequence terms $f_i$ in the recurrence $0 = \vec P \cdot \vec f$, we associate with every recurrence a sequence term $f_j$. It is often convenient to write such a recurrence of $f_j$ as $P_j f_j + e = 0$, where the excess terms $e = \vec P \cdot \vec f - P_j f_j$ contain all sequence terms $f_i$ except $f_j$. The choice of $f_j$ among $\vec f$ will be determined by the definition of excess terms (Definition 18). However, this choice remains irrelevant for the individual propagation steps, described below. We adapt these steps from the four closure properties of holonomic sequences by carrying excess terms along.
Propagation to Addition.Let us start with addition of sequence terms.
Procedure 10. Let $f$ and $g$ be sequence terms, and let $h$ be the formal name of their addition $f + g$. The associated recurrences $F$ of $f$ and $G$ of $g$ are propagated to those of $h$ by eliminating $f$ and $g$ from $F \cup G \cup \{h = f + g\}$. (By Procedure 9, this involves computing a Gröbner basis for these equations and then discarding the equations containing $f$ or $g$ as well as the corresponding nonlinear terms.) Actually, the same propagation technique works if $f + g$ is replaced by any expression in the general recurrence format $\vec P \cdot \vec l$ (a dot product of operator polynomials $\vec P$ and sequence terms $\vec l\,$). The key is that the defining equation $h = \vec P \cdot \vec l$ is again a linear recurrence. Such propagations could also be done by iterating more primitive propagations.
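In the special case of constant coefficients, propagation to addition has a particularly transparent justification: the composition of the two annihilating operators annihilates the sum, since constant-coefficient operators commute. A numeric sketch with sequences chosen purely for illustration:

```python
# If (S^2 - S - 1) f = 0 (Fibonacci) and (S - 2) g = 0 (g = 2^n),
# then the product operator (S - 2)(S^2 - S - 1) annihilates f + g.

def apply_poly(coeffs, seq):
    """Apply sum(coeffs[k] * S^k) to a sequence given as a list."""
    n = len(seq) - (len(coeffs) - 1)
    return [sum(c * seq[i + k] for k, c in enumerate(coeffs))
            for i in range(n)]

fib = [0, 1]
for _ in range(20):
    fib.append(fib[-1] + fib[-2])
pow2 = [2**n for n in range(22)]
h = [x + y for x, y in zip(fib, pow2)]

# (S - 2)(S^2 - S - 1) = S^3 - 3S^2 + S + 2, coefficients by power of S:
print(apply_poly([2, 1, -3, 1], h)[:5])   # [0, 0, 0, 0, 0]
```

The Gröbner-basis elimination of Procedure 10 generalizes this to polynomial coefficients, where the operators no longer commute.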
Example 11. Consider the goal $\sum_{j=0}^{n} a_j = g_n + a_0$ given $g_0 = 0$ and $g_{n+2} = g_n + a_{n+1} + a_{n+2}$ for all $n \in \mathbb{N}$. The defining recurrence of $g$ can be written using the operator polynomials as $S_1^2 g = g + S_1 a + S_1^2 a$. The defining recurrence of the sum $f_n := \sum_{j=0}^{n} a_j$ is $S_1 f = f + S_1 a$. We must prove that $h_n := g_n + a_0 - f_n$ is 0. To achieve this, we propagate recurrences to $h$ using the elimination procedure described above (Procedure 9) and the total-degree-based $(f, g)$-elimination order with $f \prec g$. In this example, $h_{n+2} - h_n = 0$ is the only resulting recurrence that does not contain $f$ and $g$, so we discard the rest of the Gröbner basis calculation. Since $h_{n+2} - h_n = 0$ contains only the sequence $h$, we can use it to prove the induction step (of size 2) of a proof of $\forall n.\ h_n = 0$. We are then left with the two base cases $h_0 = 0$ and $h_1 = 0$, which the Summation inference would include in its conclusion, expressed without the auxiliary symbols $f$ and $h$.

Propagation to Substitution. Clearly, any recurrence $Pf = 0$ of $f$ implies $\sigma P f = 0$. Moreover, if $\sigma P = P' \sigma$, then $P' \sigma f = 0$ gives a recurrence of $\sigma f$. Finding such a $P'$ for a general $P$ can be reduced to finding an operator polynomial $P'_X$ satisfying $\sigma X = P'_X \sigma$ for every indeterminate $X$. This amounts to pushing all indeterminates $X$ leftwards. For multipliers, we have $\sigma (M_1, \ldots, M_d) = (q(M_1, \ldots, M_d) + \vec b)\,\sigma$. In contrast, shifts are easily pushed only rightwards, namely $S_j \sigma = \sigma S_1^{q_{1j}} \cdots S_d^{q_{dj}}$. This makes such compositions of shifts suitable as indeterminates in Gröbner basis computations. Accordingly, for propagation to substitution, we enlarge our formal polynomial algebra to also contain indeterminates for these compositions of shifts, satisfying the relations (3) while also keeping (2). We note that, as operators, the indeterminates further satisfy (essentially by definition) the relations (4).

Propagation to Product. Consider first a variable-disjoint product. Let $Pf + e = 0$ be a recurrence of $f$, where $P$ is an operator polynomial on the variables of $f$ and the excess terms $e$ do not contain $f$. Then $P(fg) + eg = 0$ because $g$ is effectively a constant to $P$, and similarly for recurrences of $g$. With the help of this special case, propagation to product can be reduced to propagation to substitution, as explained below.
Procedure 14. Let $f$ and $g$ be sequence terms parameterized by the variables $\vec n$, and let $\vec m$ be a tuple of fresh variables. The recurrences of $f$ and $g$ are propagated to their pointwise product $fg$ in two steps. First, the recurrences of the variable-disjoint product $f_{\vec n}\, g_{\vec m}$ are the union of the recurrences of $f_{\vec n}$ multiplied on the right by $g_{\vec m}$ and of those of $g_{\vec m}$ multiplied on the left by $f_{\vec n}$. Second, the recurrences of the pointwise product are found by propagating to the substitution that identifies $\vec m$ with $\vec n$, using Procedure 12.
Propagation to Summation. We finally consider the summations $\sum_{n_1=0}^{n_2} f_{\vec n}$. We can assume that the variables are numbered so that the sum acts on the first two. Similarly to above, we consider the consequence $\sum_{n_1=0}^{n_2} P f_{\vec n} + \sum_{n_1=0}^{n_2} e_{\vec n} = 0$ of a recurrence $Pf + e = 0$ of the sequence term $f$, where $P$ is an operator polynomial and $e$ are excess terms. We want to find an operator polynomial $P'$ such that $\sum_{n_1=0}^{n_2} P$ becomes $P' \sum_{n_1=0}^{n_2}$ up to excess terms. Like for substitutions, finding such a $P'$ for $P$ can be reduced to finding an operator polynomial $P'_X$ satisfying $\sum_{n_1=0}^{n_2} X = P'_X \sum_{n_1=0}^{n_2}$ up to excess terms for every indeterminate $X$. The result will be a recurrence of the sum.

Procedure 15. Recurrences of a sequence term $f$ are propagated to its sum $\sum_{n_1=0}^{n_2} f_{\vec n}$ as follows. First, eliminate the multipliers $M_1$ from all recurrences of $f$. Every resulting recurrence $Pf + e = 0$ implies $\sum_{n_1=0}^{n_2} P f_{\vec n} + \sum_{n_1=0}^{n_2} e_{\vec n} = 0$. Here, $P$ is an operator polynomial that does not contain $M_1$, and the excess terms $e$ do not contain $f$. Next, each of these recurrences is rewritten into the form $P' \sum_{n_1=0}^{n_2} f_{\vec n} + E_0 + E_{n_2} + \sum_{n_1=0}^{n_2} e_{\vec n} = 0$, where $P'$ is an operator polynomial and the $E_m$'s are parts of the excess terms built by applying some operator polynomials and the substitution $\{n_1 \mapsto m\}$ to $f$. This is achieved by commuting $\sum_{n_1=0}^{n_2}$ with indeterminates other than $S_1$ and $S_2$; these two indeterminates are instead handled by telescoping.

Example 16. We continue Example 13. There we found for the summand $f_{n_1,n_2} = \binom{n_1}{n_2 - n_1}$ a recurrence $(S_1 S_2^2 - S_2 - 1)f = 0$. It is actually the only recurrence after eliminating $M_1$ as a first step of propagation to summation. Next, we set $S_1$ to 1 using a telescoping identity, and then we push the remaining shifts $S_2$ leftwards. Hence, in total we have $(S_2^2 - S_2 - 1)\sum_{n_1=0}^{n_2} f = 0$. Now this is the same recurrence that $F_{n_2+1}$ satisfies, and hence the final propagation to the difference gives $(S_2^2 - S_2 - 1)\bigl(\sum_{n_1=0}^{n_2} f - F_{n_2+1}\bigr) = 0$. This proves an induction step of size 2 and leaves two base cases that can be discharged by a theorem prover.
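The summation example can be checked numerically. Here we read the summand as $\binom{n_1}{n_2-n_1}$, an assumption on our part, but one consistent with the stated recurrence and with the Fibonacci operator $S_2^2 - S_2 - 1$ appearing in the text:

```python
# Check: f(n1, n2) = C(n1, n2 - n1) satisfies (S1 S2^2 - S2 - 1) f = 0,
# and its sum over n1 equals the Fibonacci number F(n2 + 1).
from math import comb

def f(n1, n2):
    return comb(n1, n2 - n1) if n2 - n1 >= 0 else 0

rec_ok = all(f(n1 + 1, n2 + 2) - f(n1, n2 + 1) - f(n1, n2) == 0
             for n1 in range(8) for n2 in range(8))

fib = [1, 1]                 # F(1), F(2)
for _ in range(10):
    fib.append(fib[-1] + fib[-2])

sums = [sum(f(n1, n2) for n1 in range(n2 + 1)) for n2 in range(10)]
print(rec_ok, sums == fib[:10])   # True True
```

The first check is the summand's recurrence (an instance of Pascal's rule); the second is the diagonal-sum identity $\sum_{n_1=0}^{n_2} \binom{n_1}{n_2-n_1} = F_{n_2+1}$.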
Iteration on Excess Terms. Let $g$ be the term from the negated goal to be proved to be 0. After propagating along the structure of $g$, we end up with recurrences of the form $Pg = e$, where $P$ is an operator polynomial and the excess terms $e$ do not contain $g$. In the holonomic case, $e$ will be syntactically 0. We have observed that $e$ is often 0 in the nonholonomic case as well. But if $e$ is not syntactically 0, then $Pg = e$ cannot immediately be used for a proof by induction. A solution is to iterate a full series of propagations with $e$ in place of $g$ to find $P_2 e = e_2$ and conclude $P_2 P g = P_2 e = e_2$, then repeat as long as necessary. This process always terminates, although it might fail to find recurrences.
We will impose an order on the sequence terms to accomplish three things. First, we get a proper definition of which terms in a recurrence are excess. Second, well-foundedness of the order will guarantee termination of the iteration of full propagations to excess terms. Third, the iterations can be interleaved with the basic normalizations described below.

Definition 17. The spine of a sequence term $f$ without addition, denoted by $\mathrm{spine}\,f$, is the sequence term obtained intuitively by erasing operator polynomials from $f$. Precisely, this means fully reducing $f$ by rewrite rules including $at \to t$, $M_j t \to t$, and $\{\vec n \mapsto b\vec n + \vec c\}t \to \{\vec n \mapsto b\vec n\}t$. Shift indeterminates mix with other substitutions, which explains the substitution rule. A sequence term with addition $c_1 g_1 + \cdots + c_k g_k$ contains multiple spines, one for each $g_j$. The significance of spines is that when we derive a more complex consequence from a recurrence (during elimination, by applying an operator polynomial to it), its spines do not become more complex.
We can easily describe how each propagation step changes the spines of the involved sequence terms. Propagation to the addition $f + g$ produces only the spines $e_f$ and $e_g$ in the resulting recurrences, where $e_f$ denotes a spine of a term from a recurrence of $f$, and analogously for $e_g$. Moreover, propagation to the substitution $\sigma f$ produces $\sigma e_f$, propagation to the product $fg$ produces $(\mathrm{spine}\,f)\,e_g$ and $e_f\,(\mathrm{spine}\,g)$, and propagation to the summation $\sum_{n_1=0}^{n_2} f$ produces $\{n_1 \mapsto 0\}(\mathrm{spine}\,f)$, $\{n_1 \mapsto n_2\}(\mathrm{spine}\,f)$, and $\sum_{n_1=0}^{n_2} e_f$, where $e_f$ and $e_g$ are as above.
We want propagations to preserve the invariant that excess terms are small.Given how spines change under propagation, a term order on spines offers a way to define smallness.We choose an order that also orients simplifications.
Definition 18. Fix a Knuth-Bendix order with argument coefficients [12] with exactly three weights $W_{\sum_{n=0}^{a}} > W_{(\cdot)} > 3W_\sigma > 0$ and all argument coefficients set to 2. Moreover, projection sequence terms corresponding to the $M_j$'s (Remark 4) must have equal weights, and substitutions with fewer bindings must have lower precedence. The excess (partial) order on addition-free sequence terms is obtained by comparing the spines of terms using this fixed order. The excess terms of a recurrence are all its nonmaximal sequence terms w.r.t. the excess order.
The weights for the excess order are arranged to be compatible with normalization, which pushes substitutions to the leaf nodes of the term tree and pulls summations towards the root. The resulting normal form is simply the typical way of writing terms without explicit substitutions. It is also the normal form of the rewrite system consisting of the applicable associativity and/or commutativity rules of $\cdot$ as well as rules for distributing substitutions and extracting summations; in these rules, $s$, $t$, $u$ are sequence terms, $u$ does not contain the variable $n_j$, $a$ is an affine variable sum, $M_j = \vec n \mapsto n_j$ is a projection sequence term, $\sigma$, $\sigma'$ are affine substitutions, and the numeral $c$ is nonnegative. These rules produce additions, which must be interpreted as follows. For any rule of the general form $t_0 \to c_1 t_1 + \cdots + c_k t_k$, the actual rewrite on the level of entire recurrences replaces $R + f[t_0] = 0$ by $R + c_1 f[t_1] + \cdots + c_k f[t_k] = 0$, where the $c_j$ are numerals, the sequence terms $f[t_j]$ are equal except for the distinguished subterm $t_j$, and $R$ is the sum of the remaining terms in the recurrence.
To conclude termination, it suffices to prove that t 0 dominates each of t 1 , . . ., t k individually.The proof is in our technical report.It makes apparent our choices of weights and argument coefficients for the transfinite Knuth-Bendix order.

Induction
After propagation, we consider all recurrences $Pg = 0$ of the goal sequence term $g$ to be proved to be 0. In exceptionally fortunate cases, the operator polynomial $P$ is $\pm 1$, and we are unconditionally done because, for any group, the multiplication-by-$\pm 1$ map is invertible. This happens when the objective is to prove a recurrence that this method derives as a substep anyway. Otherwise, we apply induction and leave as conditions the base cases as well as the invertibility of the multiplication maps associated with the coefficients of the leading monomials.
A common case is that the variables range over natural numbers and we have a final recurrence with leading shift $S_1^{b_1} \cdots S_d^{b_d}$; the base value set of such a recurrence is the union of the hyperplane stacks $\{\vec n \in \mathbb{N}^d \mid n_j < b_j\}$ for $j = 1, \ldots, d$. If there is more than one applicable final recurrence, we take the intersection of their base value sets w.r.t. the same monomial order. To see that this works, consider any point outside the intersection. It is a nonbase point w.r.t. some final recurrence, and hence the induction step can be taken by that recurrence.
To represent the intersection as substitutions, we distribute it over the hyperplane stack unions. This results in a union of hyperline stacks of the form $N(J, \vec b) := \{\vec n \in \mathbb{N}^d \mid n_j < b_j \text{ for all } j \in J\}$, where $J \subseteq \{1, \ldots, d\}$ and $\vec b$ vary. One such stack is represented by $\prod_{j \in J} b_j$ substitutions $\{n_j \mapsto a_j \mid j \in J\}$, where the $a_j$'s range over all values with $0 \leq a_j < b_j$. Unfortunately, distribution duplicates some base cases. To compensate, if $I \subseteq J$ and $\vec b \geq \vec c$ pointwise, then $N(I, \vec b) \supseteq N(J, \vec c)$, so $N(J, \vec c)$ can be removed in favor of $N(I, \vec b)$.
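Enumerating the substitutions of one hyperline stack is mechanical; a sketch (function name hypothetical):

```python
# Enumerate the base-case substitutions of one hyperline stack
# N(J, b) = { n in N^d : n_j < b_j for all j in J }.
from itertools import product

def stack_substitutions(J, b):
    """Yield substitutions {n_j -> a_j | j in J} with 0 <= a_j < b_j."""
    J = sorted(J)
    for values in product(*(range(b[j]) for j in J)):
        yield dict(zip(J, values))

subs = list(stack_substitutions({0, 2}, {0: 2, 2: 3}))
print(len(subs))    # 6, i.e. the product of the bounds 2 * 3
print(subs[0])      # {0: 0, 2: 0}
```

The count matches $\prod_{j \in J} b_j$, and the subsumption test $I \subseteq J$, $\vec b \geq \vec c$ is what prunes duplicated cases after distribution.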
If a variable $n \in \mathbb{Z}$ is unbounded, we perform two inductions on the rays $0 \le n$ and $n < b$.

Examples
Our procedure can prove the induction step of holonomic sequence formulas such as Example 13, the binomial formula $\sum_{i=0}^{n} \binom{n}{i} x^i y^{n-i} = (x + y)^n$. Heterogeneous recurrences, which lie beyond the holonomic fragment, enable proving elementary formulas about general sequences such as Example 11 and the following. If we ignore the holonomic base case requirements, we can, for example, prove the induction steps of Abel's binomial formula and of some Stirling number identities. Here, the Stirling numbers of the second kind $\{{k \atop n}\}$ are one of many special nonholonomic sequences that frequently arise in combinatorics. They count the number of partitions of a $k$-element set into $n$ subsets.
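As a sanity check on this counting interpretation, $\{{k \atop n}\}$ can be computed by brute force and compared against the defining recurrence used in the worked example below; a minimal sketch (function names are ours, not from the paper):

```python
from itertools import product
from math import factorial

def stirling2_count(k, n):
    """{k n} by brute force: count surjections from a k-set onto n labels,
    then divide by the n! relabelings of the blocks."""
    if n == 0:
        return 1 if k == 0 else 0
    surjections = sum(
        1 for labels in product(range(n), repeat=k) if len(set(labels)) == n
    )
    return surjections // factorial(n)

# The defining recurrence (KN - (n+1)N - 1){k n} = 0, spelled out pointwise:
# {k+1, n+1} = (n+1) {k, n+1} + {k, n} for k, n >= 0.
for k in range(6):
    for n in range(6):
        assert stirling2_count(k + 1, n + 1) == \
            (n + 1) * stirling2_count(k, n + 1) + stirling2_count(k, n)

assert stirling2_count(4, 2) == 7  # {1,2,3,4} into two blocks: 7 partitions
```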
As a further demonstration, we apply our procedure to the last equation. For convenience, we will use the name of a variable also to denote its multiplier operator, and the uppercase version of the name to denote its shift operator. The defining recurrence of the Stirling numbers then reads $(KN - (n + 1)N - 1)\{{k \atop n}\} = 0$ for $k, n \ge 0$, where $K$ and $N$ denote the shift operators for the variables $k$ and $n$, the first $n$ denotes the multiplier for the variable $n$, and the second $n$ is the variable itself. This recurrence is complemented by the initial values $\{{0 \atop 0}\} = 1$ and $\{{k \atop 0}\} = \{{0 \atop n}\} = 0$ for $k, n \ge 1$.

Starting from the right, the inverse $m!^{-1}$ of the factorial satisfies the recurrence $(mM + M - 1)\,m!^{-1} = 0$, which holds for all $m \in \mathbb{Z}$ by extension. This recurrence must be found in the initialization step because there is no propagation to division. Propagation to the substitution $\{m \mapsto h - n\}$ then gives the following recurrences, factored for clarity: To propagate to the product, we consider $\{{k_1 \atop n_1}\}$ and $(h_2 - n_2)!^{-1}$ with variables renamed apart. We must propagate to the substitution given by the following five operator polynomials: Here we added the trivial recurrences given by $H_1 - 1$ and $K_2 - 1$, which are implied by the independence from $h_1$ and $k_2$. Among the defining recurrences (4) of the compound shift indeterminates $N, H, K$, the recurrence $H_1 H_2 - H$ simplifies to $H_2 - H$ by $H_1 - 1$, and $K_1 K_2 - K$ to $K_1 - K$ by $K_2 - 1$. (In other words, the factorwise renaming of the already disjoint variables $h$ and $k$ amounts to renaming in the entire product.) The third compound shift recurrence, $N_1 N_2 - N$, simplifies to $(h_2 - n_2) N_1 - N$ by $N_2 - h_2 + n_2$. The part of the Gröbner basis with only compound shifts is then straightforwardly finished, with the result $\{KN - (n_1 + 1)N - h_2 + n_2,\; (h_2 - n_2 + 1)H - 1\}$. Hence this propagation step yields: To sum over $n$, we first eliminate $n$ from the previous two recurrences and conclude $(H(K - h) + (N - 1)(KH - (h + 1)H + 1))\,(\{{k \atop n}\}/(h - n)!) = 0$.

The sum has natural boundaries, meaning that the summand vanishes outside them. This guarantees that there will be no excess terms, which we also tediously discover when pulling out the indeterminates: Here, by the recurrence of the inverse of the factorial, we get $(-1)!^{-1} = 0$. So we obtain the recurrence $H(K - h) \sum_{n=0}^{h} \{{k \atop n}\}/(h - n)! = 0$ for the left-hand side of our goal. For the right-hand side, we unproblematically obtain $(K - h)(h^k/h!) = 0$. Hence $H(K - h)$ zeros out the difference $h^k/h! - \sum_{n=0}^{h} \{{k \atop n}\}/(h - n)!$. The largest shift $HK$ of the operator $H(K - h)$ determines that the two sets of base cases $h = 0$ and $k = 0$ are sufficient for induction.
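The goal identity $h^k/h! = \sum_{n=0}^{h} \{{k \atop n}\}/(h - n)!$, whose induction step the operator $H(K - h)$ establishes, can be checked numerically for small arguments; a sketch using exact rational arithmetic (helper names are ours):

```python
from fractions import Fraction
from functools import lru_cache
from math import factorial

@lru_cache(maxsize=None)
def stirling2(k, n):
    """Stirling numbers of the second kind via the standard recurrence."""
    if k == 0:
        return 1 if n == 0 else 0
    if n == 0:
        return 0
    return n * stirling2(k - 1, n) + stirling2(k - 1, n - 1)

def lhs(h, k):
    """Sum side: sum_{n=0}^{h} {k n} / (h - n)!"""
    return sum(Fraction(stirling2(k, n), factorial(h - n)) for n in range(h + 1))

def rhs(h, k):
    """Closed-form side: h^k / h!"""
    return Fraction(h ** k, factorial(h))

for h in range(7):
    for k in range(7):
        assert lhs(h, k) == rhs(h, k)
```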

Related Work
Holonomic sequences [20] are closely related to our work. Unlike our approach, which allows infinitely many base cases as long as they are finitely representable (Sect. 5), they are limited to a finite number of base cases. Relaxing this limitation yields approximately the homogeneous version of our propagation procedure (i.e., without excess terms), whose theory Chyzak, Kauers, and Salvy laid out [6]. Heterogeneity amounts to module Gröbner bases [5,8,13]. Its integration into propagation makes elementary identities about general sequences automatically provable, which may be of interest for general-purpose theorem provers.
In practice, hypergeometric sums are common holonomic sequences for which much faster algorithms are available. Gosper's indefinite summation algorithm [9] can be applied to compute Wilf–Zeilberger pairs [19], which offer compact proof certificates for definite sum identities. These fast methods admit generalizations to the full holonomic setting; see Koutschan's thesis [11] for an overview.
Finding a closed form for a summation, instead of only checking one, is a different but related task. A common approach is to perform a recurrence-solving phase after the recurrence computation, as in the Mathematica package Sigma [1,15].

Conclusion
We presented a procedure for proving equations involving summations within an automatic higher-order theorem prover. The procedure is inspired by holonomic sequences and partly generalizes them. It expresses the problem as recurrences and derives new recurrences from existing ones. In case of success, it establishes the induction step of a proof by induction, leaving the base cases to the prover.
As future work, we want to continue implementing the procedure in Zipperposition [17]. We hope that the subsequent practical experiments will help us settle how the side conditions of initial recurrences ought to be handled.
$\sum_{n=0}^{m+1} f_n = \sum_{n=0}^{m} f_n + f_{m+1}$ even for negative $m \in \mathbb{Z}$. Other finite intervals than $[0, m]$ are expressible as differences.
Propagation to Substitution. Consider a numeral matrix $a = [a_{kj}]_{kj} \in \mathbb{Z}^{d \times D}$ and a vector $\vec b \in \mathbb{Z}^d$. They characterize an affine substitution $\sigma = \{\vec n \mapsto a\vec n + \vec b\} = \{n_k \mapsto \sum_{j=1}^{D} a_{kj} n_j + b_k \mid 1 \le k \le d\}$. As an operator on sequences, $\sigma$ performs an affine change of variables: $(\sigma f)_{\vec n} = f_{a\vec n + \vec b}$.
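To see the change of variables concretely, take the substitution $\sigma = \{m \mapsto h - n\}$ from the worked example, applied to $f = m!^{-1}$ extended by $0$ to negative $m$. The propagated recurrences $((h - n + 1)H - 1)\,\sigma f = 0$ and $(N - h + n)\,\sigma f = 0$ can then be verified pointwise; a sketch with exact rationals (helper names are ours):

```python
from fractions import Fraction
from math import factorial

def inv_fact(m):
    """m!^{-1}, extended by 0 to negative m; satisfies (mM + M - 1) m!^{-1} = 0 on Z."""
    return Fraction(1, factorial(m)) if m >= 0 else Fraction(0)

def g(h, n):
    """(sigma f)_{h,n} = f_{h-n} for sigma = {m -> h - n} and f = m!^{-1}."""
    return inv_fact(h - n)

for h in range(-3, 6):
    for n in range(-3, 6):
        # ((h - n + 1) H - 1) g = 0: shifting h corresponds to shifting m forward
        assert (h - n + 1) * g(h + 1, n) - g(h, n) == 0
        # (N - h + n) g = 0: shifting n corresponds to shifting m backward
        assert g(h, n + 1) - (h - n) * g(h, n) == 0
```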

$S_1^{a_{1j}} \cdots S_d^{a_{dj}}$. Consequently, the recurrences of $f$ must first be expressed in terms of the composite shifts $S_j := S_1^{a_{1j}} \cdots S_d^{a_{dj}}$. As operators, these satisfy the (non)commutation relations in which $a$, $\vec c$, and the matrix $b$ are all numeric.

$S_1^{b_1} \cdots S_d^{b_d}$ w.r.t. any monomial order. Then the values at $\bigcup_{j=1}^{d} \{\vec n \in \mathbb{N}^d \mid n_j < b_j\}$ suffice for the base cases. This is a union of stacked hyperplanes and is infinite unless $d \le 1$, but it corresponds to only $\sum_{j=1}^{d} b_j$ one-variable substitutions $\{n_j \mapsto a\}$ for $1 \le j \le d$ and $0 \le a < b_j$. If our eager generalization produced variables that do not participate in their induction (i.e., whose $b_j$ is $0$), they are replaced back by their original values.
$0 \le n$ and $n < b$, if $b$ base cases are needed. The backward induction on $n < b$ can be transformed into an induction over $\mathbb{N}$ by the change of variables $n \mapsto b - 1 - n$.
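This transformation is just a re-indexing; a minimal sketch (the value of $b$ and the walk length are arbitrary choices of ours):

```python
b = 4  # number of base cases on the ray n < b

for n in range(b - 1, b - 11, -1):     # walk the ray n < b backwards
    m = b - 1 - n                      # change of variables n -> b - 1 - n
    assert n < b and m >= 0            # the ray maps onto N
    assert (b - 1 - (n - 1)) == m + 1  # a backward step in n is a forward step in m
```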