Hankel determinants of linear combinations of moments of orthogonal polynomials, II

We present a formula that expresses the Hankel determinants of a linear combination of length $d+1$ of moments of orthogonal polynomials in terms of a $d\times d$ determinant of the orthogonal polynomials. This formula has existed somewhat hidden in the folklore of the theory of orthogonal polynomials, but it deserves to be better known, and to be presented correctly and with full proof. We present four fundamentally different proofs: one that uses classical formulae from the theory of orthogonal polynomials; one that uses a vanishing argument and is due to Elouafi [J. Math. Anal. Appl. 431 (2015), 1253-1274] (but is given in an incomplete form there); one that is inspired by random matrix theory and is due to Br\'ezin and Hikami [Comm. Math. Phys. 214 (2000), 111-135]; and one that uses (Dodgson) condensation. We give two applications of the formula. In the first application, we explain how to compute such Hankel determinants in a singular case. The second application concerns the linear recurrence of such Hankel determinants for a certain class of moments that covers numerous classical combinatorial sequences, including Catalan numbers, Motzkin numbers, central binomial coefficients, central trinomial coefficients, central Delannoy numbers, Schr\"oder numbers, Riordan numbers, and Fine numbers.

1. Introduction. The purpose of this article is to put to the fore a fundamental formula for orthogonal polynomials that is implicitly hidden in the classical literature on orthogonal polynomials. It is so well hidden that seemingly even top experts in the theory of orthogonal polynomials are not aware of it. How and why this is possible is explained in greater detail in Section 2. My literature search led me to discover that the formula is stated in Lascoux's book [11], albeit incorrectly, but with a correct proof. Subsequently, I realised that the formula is stated correctly by Elouafi in [6], albeit with an incomplete proof. Finally, in reaction to [10], Arno Kuijlaars pointed out to me that the formula appears in [1], where a result due to Brézin and Hikami [2] is cited. Both papers contain a correct statement and (different) proofs; however, they use random matrix language. Again, see Section 2 for more details.
So, let me present this formula without further ado. Let $(p_n(x))_{n\ge0}$ be a sequence of monic polynomials over a field $K$ of characteristic zero¹ with $\deg p_n(x)=n$, and assume that they are orthogonal with respect to the linear functional $L$, i.e., that they satisfy $L(p_m(x)\,p_n(x))=\omega_n\,\delta_{m,n}$ with $\omega_n\ne0$ for all $n$, where $\delta_{m,n}$ is the Kronecker delta. Furthermore, we write $\mu_n$ for the $n$-th moment $L(x^n)$ of the functional $L$, for which we also use the umbral notation $\mu^n\equiv\mu_n$.²

Theorem 1. Let $n$ and $d$ be non-negative integers. Given variables $x_1,x_2,\dots,x_d$, and using the above explained umbral notation, we have
$$\frac{\det_{0\le i,j\le n-1}\big((x_1+\mu)(x_2+\mu)\cdots(x_d+\mu)\,\mu^{i+j}\big)}{\det_{0\le i,j\le n-1}(\mu_{i+j})}
=(-1)^{nd}\,\frac{\det_{1\le i,j\le d}\big(p_{n+i-1}(-x_j)\big)}{\prod_{1\le i<j\le d}(x_i-x_j)}.\tag{1.1}$$
Here, determinants of empty matrices and empty products are understood to equal 1.
Remark. The theory of orthogonal polynomials guarantees that in our setting (namely, due to the condition $\omega_n\ne0$ in the orthogonality relation) the Hankel determinant of moments in the denominator on the left-hand side of (1.1) is non-zero.
We may rewrite (1.1) using the quantities that appear in the three-term recurrence
$$p_n(x)=(x-s_{n-1})\,p_{n-1}(x)-t_{n-2}\,p_{n-2}(x),\qquad\text{for }n\ge1,\tag{1.2}$$
with initial values $p_{-1}(x)=0$ and $p_0(x)=1$, which is satisfied by the polynomials according to Favard's theorem (see e.g. [8]) for some sequences $(s_n)_{n\ge0}$ and $(t_n)_{n\ge0}$ of elements of $K$ with $t_n\ne0$ for all $n$. Namely, using the well-known fact (see e.g. [8])
$$\det_{0\le i,j\le n-1}(\mu_{i+j})=\prod_{i=1}^{n-1}t_0t_1\cdots t_{i-1},\tag{1.3}$$
we may rewrite (1.1) in the form
$$\det_{0\le i,j\le n-1}\big((x_1+\mu)\cdots(x_d+\mu)\,\mu^{i+j}\big)=(-1)^{nd}\Bigg(\prod_{i=1}^{n-1}t_0t_1\cdots t_{i-1}\Bigg)\frac{\det_{1\le i,j\le d}\big(p_{n+i-1}(-x_j)\big)}{\prod_{1\le i<j\le d}(x_i-x_j)}.\tag{1.4}$$

¹ For the analyst, (usually) this field is the field of real numbers, and a further restriction is that the linear functional $L$ is defined by a measure with non-negative density. However, the formulae in this paper do not need these restrictions and are valid in the wider context of "formal orthogonality".
This form reveals that we may regard the formula as a polynomial formula in the x i 's and the s i 's and t i 's. Indeed, the determinant on the right-hand side, being a skew-symmetric polynomial in the x i 's, is divisible by the Vandermonde product in the denominator.
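As a sanity check of the polynomial identity just described, the following sketch (our own illustration; all function names, such as `recurrence_data` and `check_theorem1`, are invented for this purpose) generates the monic orthogonal polynomials $p_n$ from the three-term recurrence (1.2) and the moments $\mu_n$ from the associated tridiagonal Jacobi matrix, and then verifies (1.1) in exact rational arithmetic for a few small cases with pairwise distinct $x_i$'s.

```python
from fractions import Fraction

def det(M):
    """Determinant by Laplace expansion along the first row (exact for integers)."""
    if not M:
        return 1
    if len(M) == 1:
        return M[0][0]
    return sum((-1) ** j * M[0][j]
               * det([row[:j] + row[j + 1:] for row in M[1:]])
               for j in range(len(M)))

def recurrence_data(s, t, N):
    """Monic p_0, ..., p_N (as coefficient lists, index = power of x) from the
    recurrence (1.2), and moments mu_0, ..., mu_{2N} of the orthogonality
    functional, read off from powers of the tridiagonal Jacobi matrix."""
    polys = [[1], [-s[0], 1]]
    for n in range(2, N + 1):
        p = [0] + polys[n - 1]                     # x * p_{n-1}
        for k, c in enumerate(polys[n - 1]):
            p[k] -= s[n - 1] * c
        for k, c in enumerate(polys[n - 2]):
            p[k] -= t[n - 2] * c
        polys.append(p)
    size = 2 * N + 1
    J = [[0] * size for _ in range(size)]
    for i in range(size):
        J[i][i] = s[i] if i < len(s) else 0
        if i + 1 < size:
            J[i][i + 1] = 1
            J[i + 1][i] = t[i] if i < len(t) else 0
    mus, row = [], [1] + [0] * (size - 1)          # row = e_0^T J^k
    for _ in range(size):
        mus.append(row[0])
        row = [sum(row[i] * J[i][j] for i in range(size)) for j in range(size)]
    return polys, mus

def check_theorem1(s, t, n, d, xs):
    """Exact check of (1.1); the x_i's must be pairwise distinct."""
    polys, mu = recurrence_data(s, t, n + d)
    def entry(i, j):                               # umbral (x_1+mu)...(x_d+mu) mu^{i+j}
        coeffs = [1]                               # coefficients of powers of mu
        for x in xs:
            new = [0] * (len(coeffs) + 1)
            for k, c in enumerate(coeffs):
                new[k] += x * c
                new[k + 1] += c
            coeffs = new
        return sum(c * mu[i + j + k] for k, c in enumerate(coeffs))
    lhs = Fraction(det([[entry(i, j) for j in range(n)] for i in range(n)]),
                   det([[mu[i + j] for j in range(n)] for i in range(n)]))
    peval = lambda p, x: sum(c * x ** k for k, c in enumerate(p))
    num = det([[peval(polys[n + i - 1], -xs[j - 1])
                for j in range(1, d + 1)] for i in range(1, d + 1)])
    den = 1
    for i in range(d):
        for j in range(i + 1, d):
            den *= xs[i] - xs[j]
    return lhs == (-1) ** (n * d) * Fraction(num, den)

# Motzkin data (s_i = t_i = 1), an "aerated Catalan" case, and a Schroeder-type case
assert check_theorem1([1] * 20, [1] * 20, 3, 2, [2, 5])
assert check_theorem1([0] * 20, [1] * 20, 2, 3, [1, 2, 3])
assert check_theorem1([2] + [3] * 20, [2] * 20, 2, 2, [1, 4])
```

With the Motzkin data $s_i=t_i=1$, for instance, `recurrence_data` reproduces the Motzkin numbers $1,1,2,4,9,21,\dots$ as moments.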
In the next section, I will present the history of Theorem 1, from a strongly biased (personal) view. As I explain there, I discovered the formula on my own while thinking about Conjecture 8 in [4], and also came up with a proof, presented here in Section 3. Later I found the earlier mentioned occurrences of the formula in [11], [6], [1] and [2]. Lascoux's argument (the one in [1] is essentially the same), which follows the classical literature of orthogonal polynomials (but is presented in [11] in his very personal language), is presented in Section 4 (in "standard" language). Section 5 brings the completion of Elouafi's vanishing argument. The random matrix-inspired proof due to Brézin and Hikami is the subject of Section 6.
Sections 7 and 8 address issues arising from my initial motivation (and Elouafi's) that in the end led to the discovery of Theorem 1: Hankel determinants of linear combinations of combinatorial sequences. Section 7 addresses the case in which all the $x_i$'s in (1.1) are equal to each other. In that case, it is the limit formula in Proposition 5 in Section 5 that has to be applied. We show in Section 7 that Elouafi's recursive approach for that case can be replaced by an approach yielding completely explicit expressions. Finally, in Section 8 we show that the theory of linearly recurrent sequences with constant coefficients implies that, in the case where the coefficients $s_i$ and $t_i$ in the three-term recurrence (1.2) are constant for large $i$, the scaled Hankel determinants of linear combinations of moments on the left-hand side of (1.1) satisfy a linear recurrence with constant coefficients of order $2^d$, plus some more specific assertions about the coefficients in this linear recurrence; see Corollary 9. This proves conjectures from [5], vastly generalising them.
2. History of Theorem 1 - a (very) personal view. I discovered Theorem 1 on my own, in a very roundabout way. It started with an email of Johann Cigler in which he asked me for a proof of a special case of (2.1). We quickly realised that we could actually prove the identity in full generality, which became the first main result in [4] (see Theorem 1 there; the reader should note that the left-hand side of (2.1) agrees with the left-hand side of (1.1), while the right-hand sides do not; in retrospect, the equality of the right-hand sides is equivalent to the Christoffel-Darboux identity, cf. [14, Theorem 3.2.2]). We then proceeded to derive a (more complicated) triple-sum expression for the "next" case (see [4, Theorem 5]). In the special case where the $s_i$'s and the $t_i$'s are constant for $i\ge1$, the orthogonal polynomial $p_n(x)$ can be expressed as a linear combination of Chebyshev polynomials (see [4, Eq. (4.2)]). This allowed us to evaluate the sum on the right-hand side of (2.1) and the afore-mentioned triple sum. We recognised a pattern, and this led us to conjecture a precise formula for the corresponding Hankel determinant (see [4, Conj. 8]), again expressed in terms of Chebyshev polynomials. Subsequently, I realised that this conjectural expression could be simplified (by means of [4, Eq. (4.2)]). The result was the right-hand side of (1.1), in the special case where the $s_i$'s and $t_i$'s are constant for $i\ge1$. The obvious question at that point was: does Formula (1.1) also hold if the $s_i$'s and $t_i$'s are generic? Computer experiments said "yes".
At this point I told myself: this identity, being a completely general identity of fundamental nature connecting orthogonal polynomials and their moments, must be known. Naturally, I consulted standard books on orthogonal polynomials, such as Szegő's classic [14], but I could not find it. After a while I then started to think about a proof. I figured out the proof of (1.1) that can be found in Section 3.
Still, I had the strong feeling that this identity must be known. So, if I could not find it in classical sources, what about "non-classical" sources? I remembered that Alain Lascoux had devoted one chapter of his book [11] on symmetric functions to orthogonal polynomials, revealing there that orthogonal polynomials can be seen as Schur functions of quadratic shapes, and demonstrating that formal identities for orthogonal polynomials can be conveniently established by adopting this point of view. So I consulted [11], and I quickly realised that Proposition 8.4.1 in the book addresses (1.1); it needed some more work to see what exactly was contained in that proposition.³ Lascoux attributes his proposition to Christoffel, without any specific reference. With this information in hand, I returned to Szegő's book [14] and made a text search for "Christoffel". I finally found the relevant theorem: Theorem 2.5. I believe that, having a look at that theorem, the reader will excuse me for not recognising, on my first attempt, that theorem as the one that is relevant for our Theorem 1.⁴ In particular, [14, Theorem 2.5] does not say anything about the proportionality factor between the two sides in (1.1) (as opposed to Lascoux, even if the expression he gives is not correct; he does provide an argument though⁵). Szegő notes that [14, Theorem 2.5] is due to Christoffel [3], but only in the special case of Legendre polynomials (indeed, at the end of [3] there appears Theorem 1 in that special case), a fact that also seems to have escaped many researchers in the theory of orthogonal polynomials.
Eventually, I found that Theorem 1 appears, correctly stated, as Theorem 1 in the relatively recent article [6] by Elouafi. However, the proof given there is incomplete.⁶ I present a completion of this proof in Section 5.

³ Lascoux's book is written in the (for many: foreign) language of plethystic operators on symmetric polynomials. I therefore provide a translation of the parts of [11] that are relevant to our discussion into "standard language" further below in Subsection 2.1.

⁴ The reader may judge her/himself: I present the theorem further below in Subsection 2.2, together with explanations of how it connects to our discussion.

⁵ Lascoux refers to "the Bazin formula" without any reference. He may be excused for that: a glance at the index of [11] leads one to [11, ...].
With all this knowledge, I consulted Mourad Ismail and asked him whether he knew the formula, respectively could refer me to a source in the literature. He immediately pointed out that the right-hand side determinant of (1.1) features in "Christoffel's theorem" about the orthogonal polynomials with respect to the measure defined by the density $\prod_{\ell=1}^{d}(x+x_\ell)\,d\mu(x)$ (with $d\mu(x)$ the density of the original orthogonality measure), that is, in [14, Theorem 2.5] respectively [7, Theorem 2.7.1]. However, the conclusion of a longer discussion was that he had not seen this formula earlier.
Finally, when I posted [10] (containing the extension of (1.1) to a rational deformation of the density $d\mu(x)$) on the arXiv, Arno Kuijlaars brought the article [1] to my attention. Indeed, Equation (2.6) in [1] is equivalent to (1.1), and it is pointed out there that this result had been obtained earlier by Brézin and Hikami in [2, Eq. (14)]. It requires some translational work to see this though; see Subsection 2.3.
In the next subsection, I provide a translation, into "standard English", of Lascoux's rendering of Theorem 1. Then, in Subsection 2.2, I present "Christoffel's theorem" and explain its connection to Theorem 1. Finally, in Subsection 2.3, I translate the random matrix result [2, Eq. (14)] into the language that we use here, to see that it is indeed equivalent to (1.1).

2.1. Lascoux's Proposition 8.4.1 in [11]. This proposition says that, given alphabets, a certain determinantal expression is proportional, up to a factor independent of $B$, to a determinant $\det_{1\le i,j\le k+1}$ of Schur functions. The proportionality factor is given in the proof of [11, Prop. 8.4.1] (except for the overall sign⁸), but it has not been correctly worked out.
In order to understand the connection with Theorem 1, let me translate Lascoux's language into the notation that I use in this paper. First of all, as mentioned in Footnote 7, the symbol $\Delta(B)$ denotes the Vandermonde product (see [11, bottom of p. 11]). Next, $S_{r^s}(C)$ is the Schur function of rectangular shape $r^s=(r,r,\dots,r)$ (with $s$ occurrences of $r$) in the alphabet $C$ (not to be confused with the complex numbers!), defined by (see [11, Eq. (1.4.3)])
$$S_{r^s}(C)=\det_{1\le i,j\le s}\big(S_{r-i+j}(C)\big),$$
where $S_a(C)$ is the complete homogeneous symmetric function of degree $a$ in the alphabet $C$. Thus, Lascoux's determinant of Schur functions can be identified with the determinant on the left-hand side of (1.1) (with $x_\ell$ replaced by $-b_\ell$ and $d$ replaced by $k+1$); this is implicit on page 6 of [11]. On the other hand, Lascoux's polynomials $P_n(x)$ are orthonormal with respect to the linear functional $L$ with moments $\mu_m$, $m=0,1,\dots$ (cf. [11, ...]), while "our" orthogonal polynomials $p_n(x)$ are monic (cf. [...]). Thus, we see that Lascoux's determinant $\det_{1\le i,j\le k+1}\big(P_{n-1+j}(b_i)\big)$ is, up to some overall factor, "our" determinant $\det_{1\le i,j\le k+1}\big(p_{n-1+j}(b_i)\big)$ on the right-hand side of (1.1) (with $x_\ell$ replaced by $-b_\ell$ and $d$ replaced by $k+1$).

⁶ The proof of Lemma 4 in [6] only works for pairwise distinct $\alpha_i$'s. It is probably possible to complete the argument even with an accordingly weakened version of that lemma. In our completion of Elouafi's proof in Section 5 we prefer to complete the proof of the lemma.

⁷ In the statement of [11, Prop. 8.4.1], the resultant $R(x,B)$ must be replaced by the Vandermonde product $\Delta(x+B)$, as is done in the proof of that proposition in [11]. I have made this correction here. Furthermore, I simplified the statement by incorporating the variable $x$ in the alphabet $B$, which means replacing "$-B-x$" by "$-B$" and "$x+B$" by "$B$". It is obvious that Lascoux was aware of this simplification. However, he needed to formulate the statement in that particular way in order to relate it to the classical result [14, Theorem 2.5] (see also [7, Theorem 2.7.1]) in the theory of orthogonal polynomials.

⁸ Lascoux's attitude towards signs is best described by himself: "... with signs that specialists will know how to write." [11]
It should now be clear to the reader that Lascoux's Proposition 8.4.1 in [11] is equivalent to Theorem 1, except that he did not bother to figure out the correct sign, and that he did not get the proportionality factor right (both of which are very understandable given the complexity of the task...; in fact, in order not to risk failing as well, I do not attempt to present the correct proportionality factor or sign in Lascoux's notation; in "standard" notation, the correct identity is given in (1.1)).

2.2. Christoffel's theorem. In the formulation of [14, Theorem 2.5], the polynomials in (2.2) form a sequence of orthogonal polynomials with respect to the linear functional with moments (2.3). The reader may now understand why I did not notice that Theorem 1 is hidden in the above assertion on my first attempt to find it in Szegő's book [14]. I did of course see that the determinant in (2.2) is our determinant on the right-hand side of (1.1) (with $x_d=x$), and that the moments in (2.3) are "almost" the entries in the Hankel determinant on the left-hand side of (1.1). There are, however, two obstacles to overcome in order to "extract" (1.1) out of the above assertion: first, one has to recall a certain determinantal formula for the orthogonal polynomials with respect to a given moment sequence, namely Lemma 4 in Section 4. Applied to the moments in (2.3), it produces indeed the Hankel determinant on the left-hand side of (1.1). The uniqueness of orthogonal polynomials up to scaling then implies that the determinants on the left-hand side of (1.1) (with $x_d=x$) and the expression (2.2) agree up to a multiplicative constant (meaning: independent of $x=x_d$). So, second, this constant has to be computed. A researcher in the theory of orthogonal polynomials does not really care about the normalisation of the orthogonal polynomials, and this seems to be the reason why apparently nobody has done this there, although it is not really difficult; see Sections 4 and 5 for two slightly different arguments.

2.3. Expectation of a product of characteristic polynomials of random Hermitian matrices. Let $d\mu(u)$ be the density of some positive measure with infinite support, all of whose moments exist. Equation (14) in [2] (cf. also [1, Eq. (2.6)]) reads
$$\Big\langle\prod_{\ell=1}^{d}\det(x_\ell-H)\Big\rangle=\frac{\det_{1\le i,j\le d}\big(p_{n+i-1}(x_j)\big)}{\prod_{1\le i<j\le d}(x_j-x_i)}.\tag{2.4}$$
(I have changed the notation so that it is in line with our notation.) Here, the left-hand side is an expectation for products of characteristic polynomials of random $n\times n$ Hermitian matrices $H$.
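Under our reading of (2.4), the identity can be checked exactly for a toy ensemble in which the matrix measure is replaced by a small atomic measure, so that the eigenvalue average becomes a finite sum against the squared-Vandermonde density. The sketch below is entirely ours: the atoms, the helper `monic_p` (which computes monic orthogonal polynomials from a bordered Hankel determinant), and the chosen parameters are all illustrative assumptions.

```python
from fractions import Fraction
from itertools import product

def det(M):
    # Laplace expansion along the first row; exact for Fraction entries.
    if len(M) == 1:
        return M[0][0]
    return sum((-1) ** j * M[0][j]
               * det([row[:j] + row[j + 1:] for row in M[1:]])
               for j in range(len(M)))

# A small atomic measure on the real line: pairs (atom, weight).
atoms = [(Fraction(a), Fraction(1, 5)) for a in (-2, -1, 0, 1, 3)]
mu = [sum(w * a ** s for a, w in atoms) for s in range(12)]

def monic_p(n):
    """Monic orthogonal polynomial of degree n for this measure (coefficient
    list), from the classical bordered-Hankel-determinant formula."""
    if n == 0:
        return [Fraction(1)]
    coeffs = [(-1) ** (n + k) * det([[mu[i + j] for j in range(n + 1) if j != k]
                                     for i in range(n)])
              for k in range(n + 1)]
    return [c / coeffs[n] for c in coeffs]

def peval(p, x):
    return sum(c * x ** k for k, c in enumerate(p))

n, d = 2, 2
xs = [Fraction(5), Fraction(7)]

# left-hand side of (2.4): exact eigenvalue average of prod_l det(x_l - H),
# taken against the squared-Vandermonde density of the toy ensemble
num = den = Fraction(0)
for combo in product(atoms, repeat=n):
    us = [a for a, _ in combo]
    weight = Fraction(1)
    for _, w in combo:
        weight *= w
    for i in range(n):
        for j in range(i + 1, n):
            weight *= (us[j] - us[i]) ** 2
    den += weight
    val = Fraction(1)
    for x in xs:
        for u in us:
            val *= x - u
    num += weight * val
lhs = num / den

# right-hand side of (2.4): determinant of p_{n+i-1}(x_j) over the Vandermonde
rhs = det([[peval(monic_p(n + i), xs[j]) for j in range(d)] for i in range(d)])
rhs /= (xs[1] - xs[0])
assert lhs == rhs
```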
3. First proof of Theorem 1 -condensation. In this section, we present the author's proof of Theorem 1, which uses the method of condensation (frequently referred to as "Dodgson condensation"). This method provides inductive proofs that are based on a determinant identity due to Jacobi (see Proposition 2 below).
For convenience, we change notation slightly. Instead of the polynomials $p_n(x)$, let us consider the polynomials $f_n(x)$ defined by
$$f_n(x)=(-1)^n\,p_n(-x).\tag{3.1}$$
In terms of these, the identity (1.1) becomes
$$\det_{0\le i,j\le n-1}\big((x_1+\mu)\cdots(x_d+\mu)\,\mu^{i+j}\big)=\det_{0\le i,j\le n-1}(\mu_{i+j})\,\frac{\det_{1\le i,j\le d}\big(f_{n+i-1}(x_j)\big)}{\prod_{1\le i<j\le d}(x_j-x_i)}.\tag{3.2}$$
(That is, we have got rid of the signs on the right-hand side of (1.1).) Our proof of (3.2) will be based on the method of condensation (see [8, Sec. 2.3]). The "backbone" of this method is the following determinant identity due to Jacobi.

Proposition 2. Let $A$ be an $N\times N$ matrix. Denote the submatrix of $A$ in which rows $i_1,i_2,\dots,i_k$ and columns $j_1,j_2,\dots,j_k$ are omitted by $A^{j_1,j_2,\dots,j_k}_{i_1,i_2,\dots,i_k}$. Then we have
$$\det A\cdot\det A^{1,N}_{1,N}=\det A^{1}_{1}\cdot\det A^{N}_{N}-\det A^{1}_{N}\cdot\det A^{N}_{1}.\tag{3.3}$$

The second ingredient of the proof of (3.2) will be the Hankel determinant identity below which, as its proof will reveal, is actually a consequence of the condensation formula (3.3).
Lemma 3. Let $(c_n)_{n\ge0}$ be a given sequence, and let $\alpha$ and $\beta$ be variables. Then, for all positive integers $n$, the Hankel determinant identity (3.4) holds.

Proof. By using multilinearity in the rows, it is easy to see (cf. also [9, Lemma 4]) that a Hankel determinant of the form $\det_{0\le i,j\le M}$ admits the expansion (3.5), where $\chi(S)=1$ if $S$ is true and $\chi(S)=0$ otherwise. We apply this identity to the first determinant on the left-hand side of (3.4), and then use multilinearity in rows $0,1,\dots,r-1$ and in rows $r,r+1,\dots,n-1$ separately, which leads to a refined expansion of $\det_{0\le i,j\le n-1}$. We substitute this, as well as (3.5) (wherever it can be applied), in (3.4). The factor $\beta-\alpha$ cancels. Subsequently, we compare coefficients of $\alpha^s\beta^{t+1}$, respectively of $\alpha^{t+1}\beta^s$, for $0\le s\le t\le n$. Thus we see that it suffices to prove the resulting family of determinant identities, and that this is moreover sufficient for the proof of (3.4). As it turns out, these identities follow from a suitable specialisation of the condensation formula (3.3).

We are now ready for the proof of Theorem 1, which, as we have seen, is equivalent to (3.2).
Proof of (3.2). We prove (3.2) in the form (3.7), writing $\mathrm{LHS}_{d,n}(x_1,\dots,x_d)$ for its left-hand side. For the induction step, we observe that, according to (3.3), the relation (3.8) holds. This can be seen as a recurrence formula for $\mathrm{LHS}_{d,n}(x_1,\dots,x_d)$, as one can use it to express $\mathrm{LHS}_{d,n}(x_1,\dots,x_d)$ in terms of expressions of the form $\mathrm{LHS}_{e,m}(x_a,\dots,x_b)$ with $e$ smaller than $d$. Hence, for the proof of (3.7) it suffices to verify that the right-hand side of (3.7) satisfies the same recurrence. Consequently, we substitute this right-hand side in (3.8).
After cancellation of the factors that are common to both sides, we arrive at an identity which is exactly the special case of Lemma 3 where $c_i=\mu^i\prod_{\ell=2}^{d-1}(x_\ell+\mu)$, $\alpha=x_1$, and $\beta=x_d$.
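Proposition 2, specialised to the rows and columns used in condensation, asserts that $\det A\cdot\det A^{1,N}_{1,N}=\det A^{1}_{1}\det A^{N}_{N}-\det A^{1}_{N}\det A^{N}_{1}$. A quick exact check of this identity on a random integer matrix (our own illustration, independent of the paper's argument):

```python
import random

def det(M):
    """Determinant by Laplace expansion along the first row (exact for ints)."""
    if not M:
        return 1
    if len(M) == 1:
        return M[0][0]
    return sum((-1) ** j * M[0][j]
               * det([row[:j] + row[j + 1:] for row in M[1:]])
               for j in range(len(M)))

def submatrix(M, rows, cols):
    """Copy of M with the given (0-based) rows and columns removed."""
    return [[M[i][j] for j in range(len(M)) if j not in cols]
            for i in range(len(M)) if i not in rows]

random.seed(2025)
N = 5
A = [[random.randint(-9, 9) for _ in range(N)] for _ in range(N)]
first, last = 0, N - 1

# det A * det(interior minor) = det A^1_1 * det A^N_N - det A^1_N * det A^N_1
lhs = det(A) * det(submatrix(A, {first, last}, {first, last}))
rhs = (det(submatrix(A, {first}, {first})) * det(submatrix(A, {last}, {last}))
       - det(submatrix(A, {last}, {first})) * det(submatrix(A, {first}, {last})))
assert lhs == rhs
```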
4. Second proof of Theorem 1 -theory of orthogonal polynomials. Here we describe a proof of Theorem 1 that is based on facts from the theory of orthogonal polynomials. We follow largely Lascoux's arguments in the proof of Proposition 8.4.1 in [11]. They show that, using the uniqueness up to scaling of orthogonal polynomials with respect to a given linear functional, the right-hand side and the left-hand side in (1.1) agree up to a multiplicative constant. For the determination of this constant we provide a simpler argument than the one given in [11].
We prove (1.1) in the form (4.1). It should be recalled that, with $L$ denoting the functional of orthogonality for the polynomials $(p_n(x))_{n\ge0}$, we have $L(x^n)=\mu_n$, where we still use the umbral notation $\mu^n\equiv\mu_n$. We start with a classical fact from the theory of orthogonal polynomials.

Lemma 4. Let $(\nu_n)_{n\ge0}$ be a sequence in $K$ with $\det_{0\le i,j\le n-1}(\nu_{i+j})\ne0$ for all $n$. Then the determinants
$$\det\begin{pmatrix}\nu_0&\nu_1&\cdots&\nu_n\\ \nu_1&\nu_2&\cdots&\nu_{n+1}\\ \vdots&\vdots&&\vdots\\ \nu_{n-1}&\nu_n&\cdots&\nu_{2n-1}\\ 1&x&\cdots&x^n\end{pmatrix},\qquad n=0,1,\dots,\tag{4.2}$$
form a sequence of orthogonal polynomials with respect to the linear functional whose $n$-th moment is $\nu_n$.

Proof. It is straightforward to check that the determinant in the last line of (4.2) is orthogonal to $1,x,x^2,\dots,x^{n-1}$. Moreover, its coefficient of $x^n$ is $\pm\det_{0\le i,j\le n-1}(\nu_{i+j})$, which by assumption is non-zero, so that the determinant in the assertion of the lemma is a polynomial of degree $n$, as desired.
Remark. The determinant in the last line of (4.2) represents another classical determinantal formula for orthogonal polynomials, expressed in terms of the moments of the corresponding linear functional of orthogonality.

Proof of (4.1). Using Lemma 4 with $\nu_n=\mu^n\prod_{\ell=1}^{d-1}(\mu-x_\ell)$, we see that the determinants in the numerator of the left-hand side of (4.1), seen as polynomials in $x_d$, form a sequence of orthogonal polynomials for the linear functional with moments (4.3). Clearly, in terms of the functional $L$ (now acting on polynomials in $x_d$) of orthogonality for the polynomials $(p_n(x_d))_{n\ge0}$, this linear functional with moments (4.3) can be expressed as in (4.4). We claim that the right-hand side of (4.1) likewise gives a sequence of orthogonal polynomials (in $x_d$) with respect to the linear functional (4.4). The first (and easy) observation is that the right-hand side of (4.1) indeed has degree $n$ in $x_d$.
Let us denote the right-hand side of (4.1) by $q_n(x_d)$. When we apply the functional (4.4) to $x_d^s\,q_n(x_d)$, for $0\le s\le n-1$, then, up to factors which are independent of $x_d$, we obtain the determinant on the right-hand side of (4.1) with its last column modified accordingly. By expanding the determinant with respect to the last column, this becomes a linear combination of terms of the form $L\big(x_d^s\,p_{n+i-1}(x_d)\big)$. Since $i\ge1$ and $s\le n-1$, all of them vanish, proving our claim.
By symmetry, the same argument can also be made for any of the variables $x_1,x_2,\dots,x_{d-1}$ in place of $x_d$. The fact that orthogonal polynomials with respect to a particular linear functional are unique up to multiplicative constants then implies that the two sides of (4.1) agree up to a factor $C$, where $C$ is independent of the variables $x_1,x_2,\dots,x_d$. In order to compute $C$, we divide both sides by $x_1^nx_2^n\cdots x_d^n$, and then compute the limits as $x_d\to\infty,\dots,x_2\to\infty,x_1\to\infty$, in this order. It is not difficult to see that in this manner the above equation reduces to
$$\det_{0\le i,j\le n-1}(\mu_{i+j})=(-1)^{nd}\,C\,\det A,$$
where $A$ is a lower triangular matrix with ones on the diagonal. Hence, we get $C=(-1)^{nd}\det_{0\le i,j\le n-1}(\mu_{i+j})$, as desired.
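Lemma 4 can be checked directly. The sketch below (our own; `bordered_op` is an invented name) expands the bordered Hankel determinant along its last row for the Motzkin moment sequence and verifies both the orthogonality and the leading coefficient.

```python
def det(M):
    """Determinant by Laplace expansion along the first row (exact for ints)."""
    if not M:
        return 1
    if len(M) == 1:
        return M[0][0]
    return sum((-1) ** j * M[0][j]
               * det([row[:j] + row[j + 1:] for row in M[1:]])
               for j in range(len(M)))

def bordered_op(nu, n):
    """Coefficient list (index = power of x) of the determinant in Lemma 4:
    the n x (n+1) Hankel array (nu_{i+j}) bordered below by 1, x, ..., x^n,
    expanded along that last row."""
    return [(-1) ** (n + k)
            * det([[nu[i + j] for j in range(n + 1) if j != k] for i in range(n)])
            for k in range(n + 1)]

nu = [1, 1, 2, 4, 9, 21, 51, 127, 323]   # Motzkin numbers as moments
p3 = bordered_op(nu, 3)
assert p3 == [1, 1, -3, 1]               # x^3 - 3x^2 + x + 1
assert p3[3] == 1                        # leading coefficient = 3x3 Hankel determinant
for s in range(3):                       # orthogonality against 1, x, x^2
    assert sum(c * nu[s + k] for k, c in enumerate(p3)) == 0
```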
5. Third proof of Theorem 1 -vanishing of polynomials. The purpose of this section is to present a completed version of Elouafi's proof of Theorem 1 in [6]. It is based on a vanishing argument: it is shown that the left-hand side of (1.1) vanishes if and only if the right-hand side does. Since both sides have the same leading monomial as polynomials in the x i 's, it follows that they must be the same up to a multiplicative constant. This constant is then determined in the last step.
To begin with, we need some preparations. Let us write
$$R(x_1,x_2,\dots,x_d)=\frac{\det_{1\le i,j\le d}\big(p_{n+i-1}(-x_j)\big)}{\prod_{1\le i<j\le d}(x_i-x_j)}$$
for the expression on the right-hand side of (1.1), forgetting the sign. Since the numerator is skew-symmetric in the $x_i$'s, it is divisible by the Vandermonde product $\prod_{1\le i<j\le d}(x_i-x_j)$ in the denominator, so that $R(x_1,x_2,\dots,x_d)$ is actually a (symmetric) polynomial in the $x_i$'s. Thus, while from its definition it seems problematic to substitute the same value for two different $x_i$'s in $R(x_1,x_2,\dots,x_d)$, this is actually not a problem. Nevertheless, it would be good to also have an explicit form available for such a case. This is afforded by the proposition below.
Since we know that $R(x_1,x_2,\dots,x_d)$ is in fact a polynomial in the $x_i$'s, we have large flexibility in how to compute this limit. We choose to do it as follows: we put $x_i=y_1+ih$ for $i=1,2,\dots,m_1$; $x_i=y_2+ih$ for $i=m_1+1,m_1+2,\dots,m_1+m_2$; etc. In the end we let $h$ tend to zero.
We describe how this works for the first group of variables; for the other groups the procedure is completely analogous. In the matrix appearing in the numerator of the definition of $R(x_1,x_2,\dots,x_d)$, we replace column $j$ by
$$\sum_{k=1}^{j}(-1)^{j-k}\binom{j-1}{k-1}\,(\text{column }k),\qquad\text{for }j=1,2,\dots,m_1.$$
Clearly, this modification of the matrix can be achieved by elementary column operations, so that the determinant is not changed. Thus, we obtain an expression in which the first $m_1$ columns are replaced by the columns of a matrix $N_1$. We now perform the assignments described earlier for $x_1,x_2,\dots,x_{m_1}$. Under these assignments, the entries of column $j$ of $N_1$ become, up to factors, the derivatives $p^{(j-1)}_{n+i-1}(-y_1)$, plus terms of higher order in $h$. Letting $h$ tend to zero, we therefore obtain the claimed expression for this group. To finish the argument, one has to proceed analogously for the remaining groups of $x_i$'s and finally put the arising factorials into the columns of the determinant.
Lemma 6. Let $q(x)=\prod_{\ell=1}^{d}(x+x_\ell)$. Furthermore, as before, we write $L$ for the linear functional with respect to which the sequence $(p_n(x))_{n\ge0}$ is orthogonal. Then
$$\det_{0\le i,j\le n-1}\big(L\big(q(x)\,x^{i+j}\big)\big)=\det_{0\le i,j\le n-1}\big(L\big(q(x)\,p_i(x)\,p_j(x)\big)\big).\tag{5.2}$$

Proof. We first rewrite the determinant on the left-hand side as in (5.3). Consequently, letting $c_{m,a}=0$ whenever $m<a$ (where $p_m(x)=\sum_{a=0}^{m}c_{m,a}x^a$), we obtain the factorisation (5.4) of the determinant into a product involving triangular matrices $A$ and $C$. Since $A$ and $C$ are triangular matrices with ones on the main diagonal (the latter follows from our assumption that the polynomials $p_m(x)$ are monic), we conclude that $\det A=\det C=1$. The combination of (5.3) and (5.4) then establishes the desired assertion.
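The determinant identity of Lemma 6 rests on a unitriangular change of basis between the monomials $x^i$ and the polynomials $p_i(x)$, and can therefore be checked exactly. The following sketch does so; all choices in it (Motzkin data, $q(x)=(x+2)(x+3)(x+5)$, $n=4$) are our own illustrations.

```python
def det(M):
    # Laplace expansion along the first row; exact for integer entries.
    if len(M) == 1:
        return M[0][0]
    return sum((-1) ** j * M[0][j]
               * det([row[:j] + row[j + 1:] for row in M[1:]])
               for j in range(len(M)))

def polymul(a, b):
    """Product of two polynomials given as coefficient lists."""
    out = [0] * (len(a) + len(b) - 1)
    for i, c in enumerate(a):
        for j, e in enumerate(b):
            out[i + j] += c * e
    return out

# Motzkin numbers as the moments of L, with monic orthogonal polynomials
# generated by the recurrence (1.2) with s_i = t_i = 1.
mu = [1, 1, 2, 4, 9, 21, 51, 127, 323, 835, 2188, 5798, 15511, 41835, 113634]

def p(n):
    a, b = [1], [-1, 1]                 # p_0, p_1
    if n == 0:
        return a
    for _ in range(n - 1):
        nxt = polymul([-1, 1], b)       # (x - 1) * p_k
        for k, c in enumerate(a):
            nxt[k] -= c                 # - p_{k-1}
        a, b = b, nxt
    return b

def L(poly):                            # the functional applied to a polynomial
    return sum(c * mu[k] for k, c in enumerate(poly))

q = [1]
for x in (2, 3, 5):                     # q(x) = (x+2)(x+3)(x+5)
    q = polymul(q, [x, 1])

n = 4
lhs = det([[L(polymul(q, [0] * (i + j) + [1])) for j in range(n)] for i in range(n)])
rhs = det([[L(polymul(q, polymul(p(i), p(j)))) for j in range(n)] for i in range(n)])
assert lhs == rhs
```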
The following lemma is [6,Lemma 4], which appears there with an incomplete proof (cf. Footnote 6).

Lemma 7. As polynomial functions in the variables $x_1,x_2,\dots,x_d$, the expression $R(x_1,x_2,\dots,x_d)$ vanishes at a point of an extension field $\overline K$ of the ground field $K$ if and only if the determinant in (5.6) vanishes there.
Proof. Let $R(x_1,x_2,\dots,x_d)=0$ for some choice of elements $x_i$ in $\overline K$. We assume that $\{x_1,x_2,\dots,x_d\}=\{y_1,y_2,\dots,y_e\}$, with the $y_i$'s being pairwise distinct, and where $y_i$ appears with multiplicity $m_i$ among the $x_j$'s, $i=1,2,\dots,e$. Since both $R(x_1,x_2,\dots,x_d)$ and $S(x_1,x_2,\dots,x_d)$ are symmetric polynomials in the $x_j$'s, without loss of generality we may assume that the first $m_1$ of the $x_i$'s are equal to $y_1$, the next $m_2$ are equal to $y_2$, and so on. The vanishing of $R(x_1,x_2,\dots,x_d)$ then means that there is a non-zero linear combination $g(x)$ of $p_n(x),p_{n+1}(x),\dots,p_{n+d-1}(x)$ which vanishes to order $m_k$ at $-y_k$, for $k=1,2,\dots,e$, as a polynomial in $x$. Hence, there exists another polynomial $h(x)$ such that $g(x)=q(x)h(x)$.

Now, by inspection, $g(x)$ is a polynomial of degree at most $n+d-1$, while $q(x)$ is a polynomial of degree $d$. Hence, $h(x)$ is a polynomial of degree at most $n-1$, which can therefore be written as a linear combination $h(x)=\sum_{i=0}^{n-1}r_i\,p_i(x)$ for some constants $r_i\in\overline K$, $i=0,1,\dots,n-1$. By its definition, $g(x)$ is orthogonal to all $p_j(x)$ with $0\le j\le n-1$. In other words, we have
$$\sum_{i=0}^{n-1}r_i\,L\big(q(x)\,p_i(x)\,p_j(x)\big)=0,\qquad\text{for }j=0,1,\dots,n-1.$$
Equivalently, the rows of the matrix $\big(L(q(x)\,p_i(x)\,p_j(x))\big)_{0\le i,j\le n-1}$ are linearly dependent, and consequently the determinant on the right-hand side of (5.2) vanishes, which in its turn implies that the determinant on the left-hand side of (5.2), which is equal to the determinant in (5.6), vanishes, as desired.
Conversely, let the determinant in (5.6) be equal to zero for some choice of $x_i$'s in $\overline K$. Again, without loss of generality we assume that the first $m_1$ of the $x_i$'s are equal to $y_1$, the next $m_2$ of the $x_i$'s are equal to $y_2$, ..., and the last $m_e$ of the $x_i$'s are equal to $y_e$, the $y_j$'s being pairwise distinct. Using the equality of Lemma 6, we see that the determinant on the right-hand side of (5.2) must vanish. Thus, the rows of the matrix on this right-hand side must be linearly dependent, so that there exist constants $r_i\in\overline K$, $i=0,1,\dots,n-1$, not all of them zero, such that $0=\sum_{i=0}^{n-1}r_i\,L\big(q(x)\,p_i(x)\,p_j(x)\big)$ for $j=0,1,\dots,n-1$. Now consider the polynomial $g(x)=\sum_{i=0}^{n-1}r_i\,q(x)\,p_i(x)$. This is a non-zero polynomial of degree at most $n+d-1$ which, by the last identity, is orthogonal to $p_j(x)$, for $j=0,1,\dots,n-1$. Hence there must exist constants $c_i\in\overline K$ such that $g(x)=\sum_{i=n}^{n+d-1}c_i\,p_i(x)$. On the other hand, by the definition of $g(x)$, we have $g^{(j-1)}(-y_k)=0$, for $k=1,2,\dots,e$ and $j=1,2,\dots,m_k$. Together, these two facts imply that $R(x_1,x_2,\dots,x_d)=0$, as desired.
We can now complete the proof of Theorem 1.
Proof of Theorem 1. By Lemma 7 we know that the symmetric polynomials $R(x_1,x_2,\dots,x_d)$ and $S(x_1,x_2,\dots,x_d)$ vanish only jointly. If we are able to show that in addition both have the same highest degree term (up to a constant), then they must be the same up to a multiplicative constant. Indeed, the highest degree term in $R(x_1,x_2,\dots,x_d)$ is obtained by selecting the highest degree term in each entry of the matrix in the numerator of the fraction on the right-hand side of (5.5). Explicitly, by the evaluation of the Vandermonde determinant, this highest degree term equals $(-1)^{nd}\,x_1^nx_2^n\cdots x_d^n$. On the other hand, the highest degree term in $S(x_1,x_2,\dots,x_d)$ is obtained by selecting the highest degree term in each entry of the matrix in the numerator of the fraction on the right-hand side of (5.6). Explicitly, this highest degree term is $\det_{0\le i,j\le n-1}(\mu_{i+j})\,x_1^nx_2^n\cdots x_d^n$. Both observations combined, we see that $S(x_1,x_2,\dots,x_d)=C\,R(x_1,x_2,\dots,x_d)$ with $C=(-1)^{nd}\det_{0\le i,j\le n-1}(\mu_{i+j})$. This finishes the proof of the theorem.
6. Fourth proof of Theorem 1 - Heine's formula and Vandermonde determinants. The subject of this section is the random matrix-inspired proof of Theorem 1 due to Brézin and Hikami [2]. Among all the proofs of Theorem 1 that I present in this paper, it is the only one that does not require knowing the formula beforehand. Rather, starting from the left-hand side, by clever use of Heine's formula (given in Lemma 8 below) and some obvious manipulations, one arrives almost effortlessly at the right-hand side. The meaning of the formula in the context of random matrices has been indicated in Subsection 2.3, and more specifically in (2.4), which says that the right-hand side of (1.1) can be interpreted as an expectation of products of characteristic polynomials of random Hermitian matrices. The random matrix flavour of the calculations below is seen in the ubiquitous multivariate density $\prod_{1\le i<j\le n}(u_j-u_i)^2\,\prod_{i=1}^{n}d\mu(u_i)$, which is, up to scaling, the density function for the eigenvalues of random Hermitian matrices.
We prove (1.1) in the form (6.1). The key is Heine's formula, given in the lemma below. Its proof is short enough that we provide it here for the sake of completeness. The integral that appears in the formula can equally well be understood in the analytic or in the formal sense.
Lemma 8. Let $d\nu(u)$ be a density with moments $\nu_s=\int u^s\,d\nu(u)$, $s=0,1,\dots$. For all non-negative integers $n$, we have
$$\det_{0\le i,j\le n-1}(\nu_{i+j})=\frac1{n!}\int\cdots\int\prod_{1\le i<j\le n}(u_j-u_i)^2\,\prod_{i=1}^{n}d\nu(u_i).\tag{6.2}$$

Proof. We start with the left-hand side: writing each entry $\nu_{i+j}$ as the integral $\int u_i^{i+j}\,d\nu(u_i)$, pulling the integrals out of the determinant, and factoring $u_i^i$ out of row $i$, we are left with a Vandermonde determinant in the $u_i$'s under the integral. If in this expression we permute the $u_i$'s, then it remains invariant, except for a sign that is created by the determinant. Let $S_n$ denote the group of permutations of $\{0,1,\dots,n-1\}$.
If we average the above multiple integral over all possible permutations of the $u_i$'s, then we obtain the square of the Vandermonde determinant under the integral, divided by $n!$. In view of the evaluation of the Vandermonde determinant, up to a shift in the indices of the $u_i$'s, this is the right-hand side of (6.2).
By orthogonality of the $p_j(x)$'s, we have $\int u^s\,p_{j-1}(u)\,d\mu(u)=0$ for $0\le s<n\le j-1$. Since the expansion of the Vandermonde determinant $\prod_{1\le i<j\le n}(u_j-u_i)$ consists entirely of monomials $\pm u_1^{s_1}u_2^{s_2}\cdots u_n^{s_n}$ with $s_i<n$ for all $i$, we see that in the above expression we may replace the determinant in the integrand by the corresponding determinant of the polynomials $p_{j-1}(u_i)$. If we substitute this above and use (6.4) again, we obtain the claimed evaluation. This gives indeed (6.1) once we apply Lemma 8 another time, now with $d\nu(u)=d\mu(u)$.
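Heine's formula is an algebraic identity in the moments, so it can be verified exactly for a small atomic measure, where the multiple integral becomes a finite sum (a sketch with our own toy data):

```python
from fractions import Fraction
from itertools import product
from math import factorial

def det(M):
    # Laplace expansion along the first row; exact for Fraction entries.
    if len(M) == 1:
        return M[0][0]
    return sum((-1) ** j * M[0][j]
               * det([row[:j] + row[j + 1:] for row in M[1:]])
               for j in range(len(M)))

# A small atomic "measure": pairs (atom, weight), weights summing to 1.
atoms = [(Fraction(-1), Fraction(1, 4)),
         (Fraction(0), Fraction(1, 2)),
         (Fraction(2), Fraction(1, 4))]
nu = [sum(w * a ** s for a, w in atoms) for s in range(8)]

n = 3
lhs = det([[nu[i + j] for j in range(n)] for i in range(n)])

# right-hand side of (6.2): sum of squared Vandermonde times product of weights
rhs = Fraction(0)
for combo in product(atoms, repeat=n):
    us = [a for a, _ in combo]
    vdm = Fraction(1)
    for i in range(n):
        for j in range(i + 1, n):
            vdm *= us[j] - us[i]
    w = Fraction(1)
    for _, wt in combo:
        w *= wt
    rhs += vdm * vdm * w
assert lhs == rhs / factorial(n)
```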
7. Hankel determinants of linear combinations of Motzkin and Schröder numbers. As described in Section 2, the origin of the author's discovery of Theorem 1 has been the interest in the evaluation of Hankel determinants of linear combinations of combinatorial sequences. Elouafi in [6] has the same motivation. The point of (1.1) in this context is that it provides a compact formula for n × n Hankel determinants of a fixed linear combination of d + 1 successive elements of a (moment) sequence (the left-hand side of (1.1)) that does not "grow" with n. (The right-hand side is a "fixed-size" formula for fixed d; the dependence on n is in the index of the orthogonal polynomials.) Elouafi provides numerous applications of Theorem 1 to the evaluation of Hankel determinants of linear combinations of Catalan, Motzkin, and Schröder numbers in [6,Sec. 3]. However, his treatment of Motzkin and Schröder numbers can be replaced by a better one.
The reader should recall that, if some of the $x_i$'s in (1.1) are equal to each other, then we have to use Proposition 5 in order to make sense of the right-hand side of (1.1). In order to apply the proposition, we must evaluate derivatives of the orthogonal polynomials at $-x_j$. How to accomplish this by a recursive approach for the orthogonal polynomials corresponding to Motzkin and Schröder numbers as moments is described in [6, Secs. 3.2 and 3.3]. Here we show that one can be completely explicit. We do not try to treat the most general case but rather restrict ourselves to illustrating the main ideas by two examples.
By Theorem 1 with $x_i=0$ for all $i$ and Proposition 5 with $y_1=0$ and $m_1=d$, we obtain a determinant formula involving the derivatives of the orthogonal polynomials at $0$ (see also [6, Eq. (1.3)]).

We consider first the special case where $\mu_n=M_n$, the $n$-th Motzkin number (cf. [13, Ex. 6.37]). It is well known (see [15] or [6, p. 1265]) that the associated orthogonal polynomials satisfy the three-term recurrence (1.2) with $s_i=t_i=1$ for all $i$. More explicitly, they are $p_n(x)=U_n\big(\frac{x-1}{2}\big)$, where $U_n(x)$ is the $n$-th Chebyshev polynomial of the second kind. The corresponding generating function is
$$\sum_{n\ge0}p_n(x)\,z^n=\frac{1}{1-(x-1)z+z^2}.$$
From this generating function, we may easily obtain an explicit expression for the derivatives $p_n^{(j)}(0)$; comparison of coefficients of $z^n$ then yields the required evaluations. If we recall that $\det_{0\le i,j\le n-1}(M_{i+j})=1$ (by (1.3) and the choice $t_i=1$ for all $i$), then we arrive at an explicit identity for the Hankel determinant $\det_{0\le i,j\le n-1}$ of the corresponding linear combination of Motzkin numbers.

The case of the (large) Schröder numbers $r_n$ is treated analogously, with $s_i=3$ and $t_i=2$ for $i\ge1$. If we recall that $\det_{0\le i,j\le n-1}(r_{i+j})=2^{\binom n2}$ (by (1.3) and the choice $t_i=2$ for all $i$), then we arrive at the corresponding identity for linear combinations of Schröder numbers.

As a final remark, we point out that the above treatment of the "Motzkin case" only applies in a specific situation, whereas the above treatment of the "Schröder case" works for all moment sequences for orthogonal polynomials which are generated by a three-term recurrence (1.2) whose coefficient sequences $(s_i)_{i\ge0}$ and $(t_i)_{i\ge0}$ become constant eventually, that is, where $s_i\equiv s$ and $t_i\equiv t$ for large enough $i$. For, under this assumption, the generating function $\sum_{n\ge0}p_n(x)\,z^n$ for the orthogonal polynomials $\big(p_n(x)\big)_{n\ge0}$ is a rational function with denominator $1-(x-s)z+tz^2$, a quadratic polynomial, as in the special case of Schröder numbers, where we had $s=3$ and $t=2$.
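For concreteness, the two classical Hankel determinant evaluations used above, $\det_{0\le i,j\le n-1}(M_{i+j})=1$ and $\det_{0\le i,j\le n-1}(r_{i+j})=2^{\binom n2}$, can be checked directly in exact arithmetic; the sequences are generated from their standard convolution recurrences.

```python
from fractions import Fraction
from itertools import permutations

def det(m):
    # exact determinant via the Leibniz expansion
    n, total = len(m), Fraction(0)
    for p in permutations(range(n)):
        sgn = (-1) ** sum(p[a] > p[b] for a in range(n) for b in range(a + 1, n))
        term = Fraction(sgn)
        for i in range(n):
            term *= m[i][p[i]]
        total += term
    return total

def motzkin(N):
    # Motzkin numbers: M_{n+1} = M_n + sum_{k=0}^{n-1} M_k M_{n-1-k}
    M = [1]
    for n in range(N - 1):
        M.append(M[n] + sum(M[k] * M[n - 1 - k] for k in range(n)))
    return M

def schroeder(N):
    # large Schroeder numbers: r_{n+1} = r_n + sum_{k=0}^{n} r_k r_{n-k}
    r = [1]
    for n in range(N - 1):
        r.append(r[n] + sum(r[k] * r[n - k] for k in range(n + 1)))
    return r

M, r = motzkin(12), schroeder(12)
assert M[:6] == [1, 1, 2, 4, 9, 21] and r[:5] == [1, 2, 6, 22, 90]
for n in range(1, 6):
    assert det([[M[i + j] for j in range(n)] for i in range(n)]) == 1
    assert det([[r[i + j] for j in range(n)] for i in range(n)]) == 2 ** (n * (n - 1) // 2)
```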
Corollary 9. Within the setup in Section 1, let $s_i\equiv s$ and $t_i\equiv t$ for $i\ge1$. Furthermore, let
$$H_n:=\det_{0\le i,j\le n-1}\Big(\sum_{k=0}^{d}\lambda_k\,\mu_{i+j+k}\Big)\Big/\det_{0\le i,j\le n-1}\big(\mu_{i+j}\big),$$
where the $\lambda_k$'s are some constants in the ground field $K$, $k=0,1,\dots,d-1$, and $\lambda_d=1$.
Then the sequence (H n ) n≥0 of (scaled) Hankel determinants of linear combinations of moments satisfies a linear recurrence of the form for some constants c i ∈ K, normalised by c 0 = 1. Explicitly, these constants can be computed as the coefficients of the characteristic polynomial (in x) In particular, we have and (2) A small detail is that the proof of Corollary 9 given below shows that, if t i ≡ t for all i, then the recurrence (8.2) holds even for n ≥ 2 d .
Furthermore, an inspection of the proof of the corollary shows that, if $s_i\equiv s$ for $i>N$ and $t_i\equiv t$ for $i\ge N$, where $N$ is some positive integer, then the recurrence (8.2) still holds, but only for $n\ge 2^d+N$.
(3) The formula (8.4) for the coefficient $c_1$ is a far-reaching generalisation of Conjecture 6 in [5], while the symmetry relation (8.5) is a far-reaching generalisation of Conjecture 7 in [5].
(4) In view of the specialisations listed in items (i)-(xi) and (xiv)-(xviii) in the list given in [4, Sec. 4], Corollary 9 implies that the (properly scaled) Hankel determinants of linear combinations of numerous combinatorial sequences satisfy a linear recurrence with constant coefficients, which, aside from the already mentioned Catalan numbers, include Motzkin numbers, central binomial coefficients, central trinomial coefficients, central Delannoy numbers, Schröder numbers, Riordan numbers, and Fine numbers.
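To illustrate Corollary 9 in its simplest instance, take $d=1$ and Motzkin moments, so $s=t=1$ and $\det_{0\le i,j\le n-1}(M_{i+j})=1$ (no scaling needed). With $\lambda_1=1$ and $\lambda_0=x_1=\lambda$, the characteristic polynomial of the $2\times2$ matrix in (8.3) is $y^2-(\lambda+1)y+1$, so the corollary predicts $H_n=(\lambda+1)H_{n-1}-H_{n-2}$ for $n\ge2$. The value $\lambda=3$ below is an arbitrary test choice.

```python
from fractions import Fraction
from itertools import permutations

def det(m):
    # exact determinant via the Leibniz expansion
    n, total = len(m), Fraction(0)
    for p in permutations(range(n)):
        sgn = (-1) ** sum(p[a] > p[b] for a in range(n) for b in range(a + 1, n))
        term = Fraction(sgn)
        for i in range(n):
            term *= m[i][p[i]]
        total += term
    return total

def motzkin(N):
    # Motzkin numbers: M_{n+1} = M_n + sum_{k=0}^{n-1} M_k M_{n-1-k}
    M = [1]
    for n in range(N - 1):
        M.append(M[n] + sum(M[k] * M[n - 1 - k] for k in range(n)))
    return M

lam = 3                       # lambda_0 = x_1; an arbitrary test value
M = motzkin(16)

def H(n):
    # Hankel determinant of the combined sequence M_{m+1} + lam*M_m;
    # no scaling is needed since det(M_{i+j}) = 1
    if n == 0:
        return Fraction(1)    # empty determinant
    return det([[M[i + j + 1] + lam * M[i + j] for j in range(n)]
                for i in range(n)])

Hs = [H(n) for n in range(7)]
# s = t = 1 for Motzkin moments, so the predicted recurrence is
# H_n = (lam + 1) H_{n-1} - H_{n-2} for n >= 2^d = 2
for n in range(2, 7):
    assert Hs[n] == (lam + 1) * Hs[n - 1] - Hs[n - 2]
```

Since $t_i\equiv 1$ for all $i$ here, the recurrence indeed already holds from $n=2^d=2$ on, in accordance with remark (2).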
Proof of Corollary 9. We use Theorem 1 in the equivalent form (3.2). In order to apply this identity, we write
$$\sum_{j=0}^{d}\lambda_j x^j=\prod_{i=1}^{d}(x+x_i),$$
with the $x_i$'s in the algebraic closure of our ground field $K$. Equivalently, $\lambda_k=e_{d-k}(x_1,x_2,\dots,x_d)$, where $e_m$ denotes the $m$-th elementary symmetric polynomial.
For the moment, we assume that the $x_i$'s are pairwise distinct in order to avoid a zero denominator in (3.2). We will get rid of this restriction in the end by a limiting argument. (An alternative would be to base the arguments on Proposition 5. This would however be more complicated.) What (3.2) affords is to express $H_n$ in terms of a $d\times d$ determinant with entries $f_{n+i-1}(x_j)$, $1\le i,j\le d$. If we expand the determinant on the right-hand side of (3.2) according to its definition, then we obtain a linear combination of products of the form
$$\prod_{i=1}^{d}f_{n+\sigma(i)-1}(x_i),\qquad\sigma\in S_d.$$
Each factor sequence $\big(f_n(x_i)\big)_{n\ge0}$ satisfies the recurrence
$$f_n(x_i)-(x_i+s)\,f_{n-1}(x_i)+t\,f_{n-2}(x_i)=0.\eqno(8.6)$$
Hence, each product sequence satisfies the same recurrence relation, namely the one resulting from the (Hadamard) product of the recurrences (8.6) over $i=1,2,\dots,d$. From the proof of [13, Theorem 6.4.9] (which is actually a much more general theorem), it follows immediately that the order of this "product" recurrence is at most $2^d$.
In order to obtain the explicit description of the "product" recurrence in the statement of the corollary, we have to recall the basics of the solution theory of (homogeneous) linear recurrences with constant coefficients. This theory says that one has to determine the zeroes of the characteristic polynomial of the recurrence; the powers $\alpha^n$ of the zeroes $\alpha$, multiplied by powers $n^e$ of $n$, where the exponent $e$ is less than the multiplicity of $\alpha$, form a basis of the solution space of the recurrence.
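As a toy illustration of this solution theory (with made-up numbers): the characteristic polynomial $(x-2)^2=x^2-4x+4$ has the double zero $2$, so $2^n$ and $n2^n$ form a basis of solutions of $a_n=4a_{n-1}-4a_{n-2}$.

```python
# Toy example with made-up numbers: any combination of the basis
# solutions 2^n and n*2^n for the double zero 2 of (x - 2)^2
# satisfies a_n = 4 a_{n-1} - 4 a_{n-2}.
a = [2**n + n * 2**n for n in range(10)]   # a generic basis combination
for n in range(2, 10):
    assert a[n] == 4 * a[n - 1] - 4 * a[n - 2]
```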
The characteristic polynomial of the recurrence (8.6) is
$$y^2-(x_i+s)\,y+t,\eqno(8.7)$$
which is also equal to the characteristic polynomial of the $2\times2$ matrix
$$\begin{pmatrix}x_i+s&-t\\1&0\end{pmatrix}.\eqno(8.8)$$
Let $y_{i,1}$ and $y_{i,2}$ be the zeroes of the polynomial (8.7). Then the powers $\big(y_{i,1}^n\big)_{n\ge0}$ and $\big(y_{i,2}^n\big)_{n\ge0}$ form a basis of solutions to the recurrence (8.6) if $y_{i,1}$ and $y_{i,2}$ are distinct, while otherwise the sequences $\big(y_{i,1}^n\big)_{n\ge0}$ and $\big(n\,y_{i,1}^n\big)_{n\ge0}$ form a basis. The product recurrence that we want to find is one for which all products $\big(n^{e-d}\prod_{i=1}^{d}y_{i,\varepsilon_i}^n\big)_{n\ge0}$ are solutions, for $\varepsilon_i\in\{1,2\}$ and $e$ bounded above by the sum of the multiplicities of the $y_{i,1}$'s. Equivalently, the characteristic polynomial of the desired product recurrence is one which has all products $\prod_{i=1}^{d}y_{i,\varepsilon_i}$, where $\varepsilon_i\in\{1,2\}$, as zeroes, with multiplicities equal to the sum of the multiplicities of the $y_{i,1}$'s minus $d-1$. It is a simple fact of linear algebra that such a polynomial is the characteristic polynomial of the tensor product of all matrices (8.8), that is, of the matrix in (8.3). (For, the eigenvalues of the tensor product of matrices are all products of eigenvalues of the individual matrices.) This proves the assertion about the explicit form of the recurrence (8.2) in the case of pairwise distinct $x_i$'s.
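The tensor (Kronecker) product construction can be checked numerically for $d=2$: with concrete, arbitrarily chosen values $s=3$, $t=2$, $x_1=1$, $x_2=5$, the characteristic polynomial of the $4\times4$ Kronecker product of the matrices (8.8), computed here by the Faddeev-LeVerrier algorithm, has $c_1=-(x_1+s)(x_2+s)$ as in (8.4) and exhibits the symmetry $c_{2^d-i}=t^{d(2^{d-1}-i)}c_i$ of (8.5).

```python
from fractions import Fraction

def kron(A, B):
    # Kronecker (tensor) product of two square matrices
    n, m = len(A), len(B)
    return [[A[i // m][j // m] * B[i % m][j % m]
             for j in range(n * m)] for i in range(n * m)]

def matmul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def charpoly(A):
    # Faddeev-LeVerrier algorithm; returns [c_0, c_1, ..., c_n], c_0 = 1
    n = len(A)
    c = [Fraction(1)]
    M = [[Fraction(0)] * n for _ in range(n)]
    for k in range(1, n + 1):
        for i in range(n):
            M[i][i] += c[-1]      # M <- M + c_{k-1} I
        M = matmul(A, M)          # M <- A M
        c.append(-sum(M[i][i] for i in range(n)) / k)
    return c

s, t, x1, x2 = 3, 2, 1, 5         # arbitrary concrete test values
A1 = [[Fraction(x1 + s), Fraction(-t)], [Fraction(1), Fraction(0)]]
A2 = [[Fraction(x2 + s), Fraction(-t)], [Fraction(1), Fraction(0)]]
c = charpoly(kron(A1, A2))        # coefficients c_0, ..., c_4

assert c[1] == -(x1 + s) * (x2 + s)          # (8.4): c_1 = -trace
assert c[4] == t**4 and c[3] == t**2 * c[1]  # (8.5): c_{4-i} = t^{2(2-i)} c_i
```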
Since everything, namely the expressions in (3.2) and the coefficients of the recurrence (8.2), is polynomial in the $x_i$'s, we may finally drop that restriction.
The coefficient $c_1$ is the coefficient of $x^{2^d-1}$ in the characteristic polynomial of (8.3). It is easy to see that this is the negative of the trace of the matrix in (8.3), that is,
$$c_1=-\prod_{i=1}^{d}(x_i+s)=-\sum_{k=0}^{d}\lambda_k s^k,$$
which establishes (8.4). Finally, the symmetry relation (8.5) is a consequence of the inherent symmetry of a linear recurrence of order 2. In order to make this visible, let $\bar f_n(x)=t^{-n/2}f_n(x)$. Then, from (3.1) we see that
$$\bar f_n(x)-(x+s)t^{-1/2}\,\bar f_{n-1}(x)+\bar f_{n-2}(x)=0,\qquad\text{for }n\ge3.\eqno(8.9)$$
Now, a recurrence can be read in the forward direction, that is, we compute the $n$-th term of the sequence from lower order terms, or in the backward direction, that is, we compute the $n$-th term from higher order terms. In this sense we see that the recurrence (8.9) is the same regardless of whether it is read in the forward or the backward direction. Consequently, the recurrence relation for the Hadamard product $\prod_{i=1}^{d}\bar f_n(x_i)$ must also be symmetric, that is, the same regardless of whether it is read in the forward or in the backward direction. If one then substitutes back $f_n(x)=t^{n/2}\bar f_n(x)$ in that symmetric recurrence, the relation (8.5) follows.