Historical Background. One of the main problems concerning orthogonality of polynomials in several variables, which are usually indexed by multi-indices, is to order them properly. This can be done case by case, usually by means of the lexicographical order. In [2], we proposed to overcome this difficulty by introducing an algebraic approach, indexing the polynomials in question by an ideal which is in fact related to the support of the orthogonality measure (if it exists). This way, we obtain a universal means of treating the polynomials. In the current paper, the main topic is the Christoffel–Darboux formula within our framework.

The recent progress in orthogonal polynomials in several variables entails the three-term relation, which in this case requires organizing the polynomials in columns and employing matrix coefficients in place of ordinary numbers. Results of this type were obtained in [5, 16], and later in [2], where the authors managed to include the case of orthogonal polynomials with respect to measures with thin support, where by "thin" we mean contained in a proper algebraic subset of \({\mathbb {R}}^n\). Knowing the three-term relation, we may formulate criteria for the existence of a Borel measure orthonormalizing the system of polynomials. Another kind of application is the possibility of writing the Christoffel–Darboux formula, which is well known for polynomials in a single real variable and highly expected to hold in several variables too. Our paper answers this call in the way that seems possible in the context of polynomials in several variables, and not only possible but intrinsically reasonable. Since the equations considered here hold mostly modulo an ideal, it is impossible to perform division by a polynomial, as happens in the case of one-variable polynomials. This justifies our approach, in which no division is involved. One of the surprising effects is that the Christoffel–Darboux formula requires equations modulo another ideal in twice as many variables, which means "doubling" the starting ideal. This has made us consider some abstract ideas which turn out to be indispensable in the process of deriving the Christoffel–Darboux formula in question. The usual way is to go from the three-term relation to the Christoffel–Darboux formula, and this has motivated us to consider some examples leading to the relation in question (or, more specifically, showing a possible way of obtaining it recursively). Among the examples, there are polynomials which are orthogonal with respect to a measure supported on the unit circle or on the Bernoulli lemniscate, though one can work out more cases like [4, 9].

Our paper may be considered a challenge to the opinion appearing in [12, p. 299] that "the conventional wisdom is that there is no CD formula for general OPs". This, however, is the less important motivation, as our primary one is to draw attention to the Christoffel–Darboux formula itself, which surprisingly reduces an expression involving several initial orthogonal polynomials to a formula involving only the last two (except that for several variables, things get more complicated).

A nice by-product we include is a very concise version of a theorem stated in [2], which appears here as Theorem 2. The list of references at the end could easily be made at least twice as long; if needed, the interested reader is encouraged to consult the references in [2].

1 Introduction

Let \({\mathbb {N}}=\{0,1,2,\ldots \}\). We make use of the Kronecker delta \(\delta _{ab}\), which is equal to 1 if \(a=b\) and to 0 otherwise, where \({\varvec{a}}\) and \({\varvec{b}}\) are any objects. All measures on \({\mathbb {R}}^d\) are assumed to be Borel measures with all moments finite, i.e., \(\int _{{\mathbb {R}}^d} \Vert x\Vert ^n \mathrm d\mu (x) <\infty \) for all \(n\geqslant 0\). Let \({\mathcal {P}}_d\) stand for the space (or rather the algebra) of all complex polynomials in \({\varvec{d}}\) (real) variables, and \({\mathcal {P}}_d^k\) for the space of all complex polynomials in \({\varvec{d}}\) variables of degree at most \({\varvec{k}}\). We say that a measure \(\mu \) on \({\mathbb {R}}^d\) orthonormalizes a sequence \(\{p_k\}_{k=0}^\infty \subset {\mathcal {P}}_d\) if \(\int _{{\mathbb {R}}^d} p_k {\bar{p}}_l \mathrm d\mu = \delta _{kl}\) for all \(k,l\in {\mathbb {N}}\), where \({\bar{p}}\) is defined by \({\bar{p}}(x)=\overline{p(x)}\), \(x\in {\mathbb {R}}^d\), \(p\in {\mathcal {P}}_d\).

Let us recall the Favard theorem on orthogonal polynomials in one variable. The following version of this theorem is a direct consequence of Theorem 53 in [2] (cf. [1]).

Theorem 1

If \(\{p_k\}_{k=0}^\infty \) is a sequence of real polynomials in one variable, such that \(p_0 = 1\) and \(\deg p_k=k\) for all \(k\in {\mathbb {N}}\), then the following two conditions are equivalent:

  1. (i)

    there exists a measure \(\mu \) on \({\mathbb {R}}\) which orthonormalizes \(\{p_k\}_{k=0}^\infty \);

  2. (ii)

    for every \(k \in {\mathbb {N}}\), there exist \(a_k \in {\mathbb {R}}\) and \(b_k \in {\mathbb {R}}\), such that

    $$\begin{aligned} X p_k = a_k p_{k+1} + b_k p_k + a_{k-1} p_{k-1}, \text { where } a_{-1} {\mathop {=}\limits ^{\scriptscriptstyle {\textsf{def}}}}1 \text { and } p_{-1} {\mathop {=}\limits ^{\scriptscriptstyle {\textsf{def}}}}0. \end{aligned}$$

The condition (ii) is customarily called the three-term relation.
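For a concrete illustration, condition (ii) can be tested symbolically. The following sketch (a mere sanity check; the normalized Gaussian weight and the Hermite polynomials are our assumed example, not part of the theorem) computes \(a_k\) and \(b_k\) by integration and verifies the three-term relation:

```python
# A sanity check of the three-term relation (ii) for the Hermite
# polynomials orthonormalized w.r.t. the weight exp(-x^2)/sqrt(pi),
# so that p_0 = 1 as in Theorem 1.
import sympy as sp

x = sp.symbols('x', real=True)
w = sp.exp(-x**2) / sp.sqrt(sp.pi)
L = lambda f: sp.integrate(f * w, (x, -sp.oo, sp.oo))

n = 4
p = [sp.hermite(k, x) / sp.sqrt(L(sp.hermite(k, x)**2)) for k in range(n + 2)]

for k in range(n + 1):
    a_k = L(x * p[k] * p[k + 1])          # turns out to be sqrt((k+1)/2)
    b_k = L(x * p[k] * p[k])              # 0 here, by symmetry of the weight
    rhs = a_k * p[k + 1] + b_k * p[k]
    if k > 0:
        rhs += L(x * p[k] * p[k - 1]) * p[k - 1]   # the coefficient a_{k-1}
    assert sp.simplify(sp.expand(x * p[k] - rhs)) == 0
```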

When considering a multi-variable version of this theorem, successful attempts were undertaken by Kowalski [5] and then further developed by Xu [16]; yet, they formulated their versions only for full polynomial bases of \({\mathcal {P}}_d\). This clearly excludes some interesting measures, e.g., the Lebesgue measure on the unit circle in \({\mathbb {R}}^2\), as in this case every polynomial is orthogonal to \(x^2+y^2-1\). It was discovered in [2] that this deficiency can be overcome by introducing equality modulo an ideal instead of dealing with ordinary equality in the three-term relation. The aforesaid ideal in the case of the circle would be the set of all polynomials vanishing on the circle.

We now proceed to introduce the main notions required to state the Favard theorem in several variables. Let \(V \subset {\mathcal {P}}_d\) be a proper ideal and \(\Pi _V:{\mathcal {P}}_d \rightarrow {\mathcal {P}}_d/V\) be the canonical quotient map. We additionally assume that \({\varvec{V}}\) is a \(*\)-ideal, i.e., \({\overline{p}} \in V\) whenever \(p\in V\). We will say that a set of polynomials is linearly independent over \({\varvec{V}}\) (or \({\varvec{V}}\)-linearly independent) if \(\Pi _V\) maps this set injectively onto a linearly independent subset of \({\mathcal {P}}_d/V\). Let

$$\begin{aligned} d_V(k) {\mathop {=}\limits ^{\scriptscriptstyle {\textsf{def}}}}\dim \Pi _V({\mathcal {P}}^k_d) - \dim \Pi _V({\mathcal {P}}^{k-1}_d), \quad k\geqslant 1, \end{aligned}$$

and \(d_V(0){\mathop {=}\limits ^{\scriptscriptstyle {\textsf{def}}}}1\). If \(d_V(k) \geqslant 1\) for all \(k\geqslant 0\) (which is the case most often encountered), we set \(\varkappa _V = \infty \); otherwise, \(\varkappa _V = \max \{k\geqslant 1 :d_V(k) \ne 0\}\). This definition makes sense, because by formula (8) in [2], \(d_V(k+1) = 0\) provided that \(d_V(k)=0\), \(k\geqslant 1\). Furthermore, a sequence \(\{Q_k\}_{k=0}^{\varkappa _V}\) is called a rigid \({\varvec{V}}\)-basis of \({\mathcal {P}}_d\) if every \(Q_k\) is a column polynomial of size \(d_V(k)\), that is,

$$\begin{aligned} Q_k = \begin{bmatrix}q_1^{(k)} \\ \vdots \\ q_{d_V(k)}^{(k)}\end{bmatrix},\quad q^{(k)}_j \in {\mathcal {P}}_d\ (j=1,\ldots , d_V(k)), \end{aligned}$$
(1)

all polynomials in \(Q_k\) are of degree \({\varvec{k}}\) and the set

$$\begin{aligned} \left\{ q^{(k)}_{j_k}+V:k\in {\mathbb {N}},\ k\leqslant \varkappa _V,\ j_k=1,\ldots , d_V(k)\right\} \end{aligned}$$

is a basis of \({\mathcal {P}}_d/V\). As shown in [2], such bases always exist; moreover, they can be built from the monomials. If \(p,q \in {\mathcal {P}}_d\), then the notation \(p{\mathop {=}\limits ^{{\scriptscriptstyle {{ V }}}}}q\) means that \(p+V=q+V\) (or, equivalently, \(p-q \in V\)). If \({\varvec{P}}\) and \({\varvec{Q}}\) are column polynomials, then we write \(P{\mathop {=}\limits ^{{\scriptscriptstyle {{ V }}}}}Q\) if the columns are of the same size and all the entries of \(P-Q\) are in \({\varvec{V}}\). If \(\{Q_k\}_{k=0}^{\varkappa _V}\) is a rigid \({\varvec{V}}\)-basis of \({\mathcal {P}}_d\), \(p\in {\mathcal {P}}_d\) and \(\deg p=k \leqslant \varkappa _V\), then there are unique scalar row vectors \(C_0,\ldots ,C_k\), such that \(p {\mathop {=}\limits ^{{\scriptscriptstyle {{ V }}}}}\sum _{j=0}^k C_j Q_j\) (this remains true if \(k>\varkappa _V\), provided that the summation is done over \(j=0,\ldots ,\varkappa _V\)).
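To make the numbers \(d_V(k)\) tangible, they can be computed directly for a concrete ideal. The sketch below (sympy; the ideal of the unit circle, generated by \(x^2+y^2-1\), is our assumed example) reduces all monomials of degree at most \({\varvec{k}}\) modulo a Gröbner basis of \({\varvec{V}}\) and obtains \(\dim \Pi _V({\mathcal {P}}^k_2)\) as the rank of the matrix of remainders:

```python
# One possible computation of d_V(k) for V = (x^2 + y^2 - 1) in P_2.
import sympy as sp

x, y = sp.symbols('x y')
G = sp.groebner([x**2 + y**2 - 1], x, y, order='grevlex')

def dim_PkV(k):
    """dim Pi_V(P^k_2): rank of the span of all monomials of degree <= k
    reduced modulo the Groebner basis G."""
    if k < 0:
        return 0
    rem = [G.reduce(x**i * y**j)[1]
           for i in range(k + 1) for j in range(k + 1 - i)]
    monos = sorted({m for r in rem for m in sp.Poly(r, x, y).monoms()})
    M = sp.Matrix([[sp.Poly(r, x, y).coeff_monomial(x**a * y**b)
                    for (a, b) in monos] for r in rem])
    return M.rank()

print([dim_PkV(k) - dim_PkV(k - 1) for k in range(5)])   # [1, 2, 2, 2, 2]
```

The printed values agree with \(d_V(0)=1\) and \(d_V(k)=2\) for \(k\geqslant 1\), matching the sizes of the columns \(Q_k\) in the circle example of Sect. 3.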

Given a linear functional \(L:{\mathcal {P}}_d\rightarrow {\mathbb {C}}\), we write

$$\begin{aligned} L([p_{k,l}]_{k=0}^m{}_{l=0}^n) = [L(p_{k,l})]_{k=0}^m{}_{l=0}^n, \end{aligned}$$

where \(p_{k,l} \in {\mathcal {P}}_d\). This way, we can make sense of \(L(PQ^*)\), where \({\varvec{P}}\) and \({\varvec{Q}}\) are column polynomials (not necessarily of the same size), and

$$\begin{aligned} Q^* = \begin{bmatrix}{\bar{q}}_1&\ldots&{\bar{q}}_m\end{bmatrix} \quad \text {if} \quad Q= \begin{bmatrix} q_1 \\ \vdots \\ q_m \end{bmatrix}. \end{aligned}$$

Finally, we say that \({\varvec{L}}\) orthonormalizes a rigid \({\varvec{V}}\)-basis \(\{Q_k\}_{k=0}^{\varkappa _V}\) of real polynomials if \(L(Q_kQ_l^{\mathop \intercal }) = 0\) for all \(k,l \in {\mathbb {N}}\), \(k,l\leqslant \varkappa _V \), \(k\ne l\), and \(L(Q_kQ_k^{\mathop \intercal }) = I\), \(k\in {\mathbb {N}}\), \(k\leqslant \varkappa _V\), where \({\varvec{I}}\) stands for the identity matrix of the appropriate size (which in this setting is equal to \(d_V(k)\)). A linear functional \(L:{\mathcal {P}}_d \rightarrow {\mathbb {C}}\) is called positive definite if \(L(p\overline{p})\geqslant 0\) for all \(p\in {\mathcal {P}}_d\).

We are now in a position to state our generalization of the Favard theorem, which is in the flavor of [2].

Theorem 2

Let \(V\subset {\mathcal {P}}_d\) be a proper \(*\)-ideal and \(\{Q_k\}_{k=0}^{\varkappa _V}\) be a rigid \({\varvec{V}}\)-basis of real polynomials with \(Q_0=1\). Then, the following conditions are equivalent:

  1. (A)

    there exists a positive definite \(L:{\mathcal {P}}_d\rightarrow {\mathbb {C}}\) which orthonormalizes \(\{Q_k\}_{k=0}^{\varkappa _V}\), such that \(V\subset \ker L\);

  2. (B)

    for every \(j=1,\ldots ,d\), there exist systems of scalar matrices \(\{A_{k,j}\}_{k=0}^{\varkappa _V}\) and \(\{B_{k,j}\}_{k=0}^{\varkappa _V}\) of appropriate sizes, such that

    $$\begin{aligned} X_jQ_k {\mathop {=}\limits ^{{\scriptscriptstyle {{ V }}}}}A_{k,j}Q_{k+1}+ B_{k,j}Q_k + A_{k-1,j}^* Q_{k-1}, \end{aligned}$$
    (2)

    for all \(k\in {\mathbb {N}}\), \(k\leqslant \varkappa _V\), and \(j=1,\ldots ,d\), where \(A_{-1,j}=1\) and \(Q_{-1}=0\); if \(\varkappa _V < \infty \), then \(A_{\varkappa _V,j}\) is a column \([1, \ldots , 1]^{\mathop \intercal }\) with \(d_V(\varkappa _V)\) entries and \(Q_{\varkappa _V+1} = 0\).

Proof

This is a specialized version of Theorem 36 in [2]: conditions (A) and (B) therein are simplified by our assumption that \(\{Q_k\}_{k=0}^{\varkappa _V}\) is a rigid \({\varvec{V}}\)-basis of \({\mathcal {P}}_d\); in particular, the lengthy condition (B) therein reduces to (B-i) only.

\(\square \)

Remark 1

As follows from Theorem 36 in [2], the matrix \(\begin{bmatrix}A_{k,1}^*&\ldots&A_{k,d}^* \end{bmatrix}^*\) must necessarily be injective for all \(k\geqslant 0\). Even more, all the matrices \(A_{k,j}\) and \(B_{k,j}\) must be real. To see this, one can take adjoints of both sides in (2) and then transpose them to deduce that (2) holds with \((A_{k,j}^*)^{\mathop \intercal }\) and \((B_{k,j}^*)^{\mathop \intercal }\) in place of \(A_{k,j}\) and \(B_{k,j}\), respectively, which employs the assumption that all \(Q_k\) are real column polynomials. Since \(\{Q_k\}_{k=0}^{\varkappa _V}\) is a rigid \({\varvec{V}}\)-basis, the matrices in (2) are unique, and thus, \((A_{k,j}^*)^{\mathop \intercal }= A_{k,j}\) and \((B_{k,j}^*)^{\mathop \intercal }= B_{k,j}\), which proves our claim. In particular, (2) can be written with \(A_{k-1,j}^{\mathop \intercal }\) in place of \(A_{k-1,j}^*\). Actually, all the matrices \(B_{k,j}\) are symmetric. Indeed, if we multiply both sides of (2) by \(Q_k^{\mathop \intercal }\) from the left, and then apply \({\varvec{L}}\) appearing in Theorem 2 to the equation, after omitting the zero terms and noticing that \(L(B_{k,j}Q_kQ_k^{\mathop \intercal }) = B_{k,j} L(Q_kQ_k^{\mathop \intercal })\), we get

$$\begin{aligned} L(X_jQ_kQ_k^{\mathop \intercal }) = B_{k,j}, \quad k\in {\mathbb {N}},\ j \in \{1,\ldots ,d\}. \end{aligned}$$

It now suffices to justify the symmetry of the left-hand side:

$$\begin{aligned} L(X_jQ_kQ_k^{\mathop \intercal })^{\mathop \intercal }= L \big ( (X_jQ_kQ_k^{\mathop \intercal })^{\mathop \intercal }\big ) = L \big ( X_j (Q_kQ_k^{\mathop \intercal })^{\mathop \intercal }\big ) = L( X_jQ_kQ_k^{\mathop \intercal }). \end{aligned}$$

Note that this idea provides another way of showing that the matrices \(A_{k,j}\) and \(B_{k,j}\) are real.

As is easily seen, the case when \(\varkappa _V<\infty \) (in other terms, \(\dim {\mathcal {P}}_d/V < \infty \)) corresponds to a finite number of polynomials linearly independent over \({\varvec{V}}\); thus, in this case, we can have only a finite number of orthogonal polynomials whenever \(V\subset \ker L\). Indeed, one can readily verify that under the latter assumption, any set \(C\subset {\mathcal {P}}_d\) such that \(L(p{\bar{q}}) = \delta _{pq}\), \(p,q\in C\), must be \({\varvec{V}}\)-linearly independent, and thus, its cardinality is no greater than \(\dim {\mathcal {P}}_d/V\).

In the single-variable case, Theorem 2 leads to the Favard theorem (Theorem 1) due to the well-known fact that every positive definite linear functional on \({\mathcal {P}}_1\) can be represented as an integral with respect to a nonnegative Borel measure on \({\mathbb {R}}\) (see the discussion concerning statement (50) in [2]; see also [1, 10]). Disappointingly, albeit challengingly, positive definiteness does not guarantee the existence of such a representing measure in the several-variable case (again, see the comment on (50) in [2]). A substantial part of [2] is devoted to conditions ensuring the existence of representing measures in the multi-variable case.

2 Christoffel–Darboux Kernel in the Case of Real Variables

Let \(V\subset {\mathcal {P}}_d\) be a proper \(*\)-ideal and let \(\{Q_k\}_{k=0}^{\varkappa _V}\) be a rigid \({\varvec{V}}\)-basis of \({\mathcal {P}}_d\). For every \(n\in {\mathbb {N}}\), \(n\leqslant \varkappa _V\), we define the associated Christoffel–Darboux kernel by the formula

$$\begin{aligned} K_n(x,y){\mathop {=}\limits ^{\scriptscriptstyle {\textsf{def}}}}\sum _{k=0}^n Q_k^*(y)Q_k(x), \quad x,y\in {\mathbb {R}}^d. \end{aligned}$$
(3)

We use the shorthand notation \(\bigcup _{k=0}^n Q_k\) for the set of all entries of all column polynomials \(Q_k\), \(k=0,\ldots ,n\). It is worth noticing that each \(K_n\) enjoys the reproducing property in the following sense (cf. formula (1.2.37) in [11]).

Proposition 3

If \(\{Q_k\}_{k=0}^{\varkappa _V}\) is a rigid \({\varvec{V}}\)-basis of \({\mathcal {P}}_d\) which is orthonormal with respect to an inner product \(\langle \cdot ,\cdot \rangle \) in \({\mathcal {P}}_d\), and \(X_n {\mathop {=}\limits ^{\scriptscriptstyle {\textsf{def}}}}\textrm{lin}\bigcup _{k=0}^n Q_k\) for \(n\in {\mathbb {N}}\), \(n\leqslant \varkappa _V\), then \(K_n\) is a reproducing kernel for \(X_n\), which means that, for every \(p\in X_n\), we have the equation

$$\begin{aligned} p(y) = \langle p,K_n(\cdot ,y)\rangle ,\quad y\in {\mathbb {R}}^d. \end{aligned}$$

Proof

Let us write \(\bigcup _{k=0}^n Q_k = \{q_0,\ldots ,q_m\}\). The polynomial \(p\in X_n\) may be written uniquely as \(p=\sum _{k=0}^m a_k q_k\) with some \(a_k \in {\mathbb {C}}\). Similarly, \(K_n(x,y) = \sum _{k=0}^m \overline{q_k(y)} q_k(x)\), \(x,y\in {\mathbb {R}}^d\). Then

$$\begin{aligned} \langle p,K_n(\cdot ,y)\rangle = \left\langle \sum _{k=0}^m a_k q_k, \sum _{k=0}^m \overline{q_k(y)} q_k\right\rangle = \sum _{k=0}^m a_k q_k(y) = p(y), \end{aligned}$$

which is the desired conclusion. \(\square \)
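The reproducing property itself is easy to test symbolically. A minimal sketch (sympy; \(d=1\), \(V=\{0\}\) and the orthonormalized Hermite system serve as an assumed example):

```python
# Checking <p, K_n(., y)> = p(y) for p in X_n = lin{p_0, ..., p_n}.
import sympy as sp

x, y = sp.symbols('x y', real=True)
w = sp.exp(-x**2) / sp.sqrt(sp.pi)
L = lambda f: sp.integrate(f * w, (x, -sp.oo, sp.oo))   # <f, g> = L(f g), real case

n = 3
p = [sp.hermite(k, x) / sp.sqrt(L(sp.hermite(k, x)**2)) for k in range(n + 1)]

K = sum(pk.subs(x, y) * pk for pk in p)     # K_n(x, y); real case, no conjugation
q = 2*p[0] - p[1] + 3*p[3]                  # an arbitrary element of X_n
assert sp.simplify(sp.expand(L(q * K) - q.subs(x, y))) == 0
```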

If \(H_1,H_2\) are linear subspaces of \({\mathcal {P}}_d\), then we identify the algebraic tensor product \(H_1\otimes H_2\) with the linear space spanned by all polynomials of the form \(h_1\otimes h_2 \in {\mathcal {P}}_{2d}\), where

$$\begin{aligned} (h_1\otimes h_2)(x,y) = h_1(x)h_2(y),\quad x,y\in {\mathbb {R}}^d. \end{aligned}$$

We may now formulate a multi-variable version of the Christoffel–Darboux formula. If \(V=\{0\}\), then it coincides with Theorem 3.6.3 in [3].

Theorem 4

Let \(V\subset {\mathcal {P}}_d\) be a proper \(*\)-ideal, and \(\{Q_k\}_{k=0}^{\varkappa _V}\) be a rigid \({\varvec{V}}\)-basis of real polynomials with \(Q_0=1\). Assume that \(\{Q_k\}_{k=0}^{\varkappa _V}\) obeys the three-term relation (B) in Theorem 2. Then, \((x_j-y_j)K_n(x,y)\) is equal to

$$\begin{aligned}{}[A_{n,j}Q_{n+1}(x)]^{\mathop \intercal }Q_n(y) - Q_n(x)^{\mathop \intercal }[A_{n,j}Q_{n+1}(y)] \end{aligned}$$
(4)

modulo the ideal \(V_2 {\mathop {=}\limits ^{\scriptscriptstyle {\textsf{def}}}}V\otimes {\mathcal {P}}_d+{\mathcal {P}}_d\otimes V \subset {\mathcal {P}}_{2d}\) for all \(n\in {\mathbb {N}}\), \(n\leqslant \varkappa _V\), and \(j=1,\ldots ,d\).

Proof

Fix \(j\in \{1,\ldots , d\}\) and \(n\in {\mathbb {N}}\), \(n\leqslant \varkappa _V\). For \(s=0,\ldots ,n\), let \({{\tilde{K}}}_s(x,y)\) denote the expression (4) with \({\varvec{n}}\) replaced by \({\varvec{s}}\). Applying (B) together with Remark 1 for a fixed \(s\in \{0,\ldots , n\}\), we get

$$\begin{aligned} {{\tilde{K}}}_s(x,y)&= [A_{s,j} Q_{s+1}(x)]^{\mathop \intercal }Q_s(y) - Q_s(x)^{\mathop \intercal }[A_{s,j} Q_{s+1}(y)]\\&= [x_jQ_s(x)- B_{s,j}Q_s(x) - A_{s-1,j}^{\mathop \intercal }Q_{s-1}(x)]^{\mathop \intercal }Q_s(y)\\&\quad - Q_s(x)^{\mathop \intercal }[y_jQ_s(y)- B_{s,j}Q_s(y) - A_{s-1,j}^{\mathop \intercal }Q_{s-1}(y)]\\&\quad + r_s(x)^{\mathop \intercal }Q_s(y) + Q_s(x)^{\mathop \intercal }{\tilde{r}}_s(y), \end{aligned}$$

where \(r_s\) and \({\tilde{r}}_s\) are column polynomials equal to 0 modulo \({\varvec{V}}\). Remembering that \(B_{k,j}\) are real and symmetric (see Remark 1), we get

$$\begin{aligned} {\tilde{K}}_s(x,y)&= (x_j-y_j) Q_s(x)^{\mathop \intercal }Q_s(y)\\&\quad - [A_{s-1,j}^{\mathop \intercal }Q_{s-1}(x)]^{\mathop \intercal }Q_s(y) + Q_s(x)^{\mathop \intercal }[A_{s-1,j}^{\mathop \intercal }Q_{s-1}(y)]\\&\quad +r_s(x)^{\mathop \intercal }Q_s(y) + Q_s(x)^{\mathop \intercal }{\tilde{r}}_s(y)\\&= (x_j-y_j) Q_s(x)^{\mathop \intercal }Q_s(y) + {\tilde{K}}_{s-1}(x,y) +r_s(x)^{\mathop \intercal }Q_s(y) + Q_s(x)^{\mathop \intercal }{\tilde{r}}_s(y). \end{aligned}$$

Technically, this procedure works only for \(s\geqslant 1\), but it is easy to see that the term \({\tilde{K}}_{s-1}\) for \(s=0\) is equal to zero, since \(Q_{-1}=0\). Summing over \(s=0,\ldots ,n\), we complete the proof. \(\square \)

The assertion of Theorem 4 can be written briefly as

$$\begin{aligned} (x_j-y_j) K_n(x,y) {\mathop {=}\limits ^{{\scriptscriptstyle {{ V _2}}}}}[A_{n,j}Q_{n+1}(x)]^{\mathop \intercal }Q_n(y) - Q_n(x)^{\mathop \intercal }[A_{n,j}Q_{n+1}(y)]. \end{aligned}$$
(5)

Note that in the case \(V=\{0\}\), the above equation coincides with the ordinary one. We say that \(K_n\) defined by (3) satisfies the \({\varvec{j}}\)th Christoffel–Darboux formula if (5) holds true with some scalar matrix \(A_{n,j}\).
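In the one-variable case \(V=\{0\}\), where (5) becomes ordinary equality, the formula can be checked directly; a sketch (sympy, again with the orthonormalized Hermite system as an assumed example):

```python
# (x - y) K_n(x, y) = a_n (p_{n+1}(x) p_n(y) - p_n(x) p_{n+1}(y)), V = {0}.
import sympy as sp

x, y = sp.symbols('x y', real=True)
w = sp.exp(-x**2) / sp.sqrt(sp.pi)
L = lambda f: sp.integrate(f * w, (x, -sp.oo, sp.oo))

n = 3
p = [sp.hermite(k, x) / sp.sqrt(L(sp.hermite(k, x)**2)) for k in range(n + 2)]

K = sum(pk.subs(x, y) * pk for pk in p[:n + 1])
a_n = L(x * p[n] * p[n + 1])                # the recurrence coefficient
lhs = (x - y) * K
rhs = a_n * (p[n + 1] * p[n].subs(x, y) - p[n] * p[n + 1].subs(x, y))
assert sp.simplify(sp.expand(lhs - rhs)) == 0
```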

Lemma 5

Let \({\varvec{X}}\) and \({\varvec{Y}}\) be real or complex vector spaces, and \(X_0\) and \(Y_0\) be their linear subspaces, respectively. Then, there is a linear isomorphism

$$\begin{aligned} \varPhi : (X/X_0) \otimes (Y/Y_0) \rightarrow (X \otimes Y)/(X_0\otimes Y + X \otimes Y_0), \end{aligned}$$

such that \(\varPhi ((x+X_0)\otimes (y+Y_0)) = x\otimes y + X_0\otimes Y + X \otimes Y_0\) for all \(x\in X\) and \(y\in Y\).

Proof

The authors are aware that this lemma is commonly regarded as a straightforward consequence of the Yoneda lemma, a well-established result in category theory. However, we have not been able to find any direct reference in the mathematical literature (apart from fishy websites), and hence, we propose our short proof without recourse to any "abstract nonsense".

For convenience, we abbreviate \(X_0\otimes Y + X \otimes Y_0\) to \({\textsf{T}}(X_0,Y_0)\). We begin with showing that the mapping

$$\begin{aligned} (X/X_0) \times (Y/Y_0) \ni (x+X_0,y+Y_0) \mapsto x\otimes y + {\textsf{T}}(X_0,Y_0) \in (X \otimes Y)/{\textsf{T}}(X_0,Y_0) \end{aligned}$$

is well defined. Let \(x_1+X_0 = x_2+ X_0\) and \(y_1+Y_0 = y_2 + Y_0\). Then

$$\begin{aligned} x_1\otimes y_1 = x_2\otimes y_2 + (x_1-x_2)\otimes y_2 + x_1 \otimes (y_1-y_2); \end{aligned}$$

thus, \(x_1\otimes y_1\) and \(x_2\otimes y_2\) are equal modulo \({\textsf{T}}(X_0,Y_0)\) and the mapping is well defined. Since it is bilinear, the universal factorization property for tensor products yields the linear mapping

$$\begin{aligned} \varPhi : (X/X_0) \otimes (Y/Y_0) \rightarrow (X \otimes Y)/{\textsf{T}}(X_0,Y_0), \end{aligned}$$

resulting in \(\varPhi ((x+X_0)\otimes (y+Y_0)) = x\otimes y + {\textsf{T}}(X_0,Y_0)\) for all \(x\in X\) and \(y\in Y\).

We now proceed to define a mapping which will promptly turn out to be the inverse of \(\varPhi \). Consider the bilinear mapping

$$\begin{aligned} X\times Y \ni (x,y) \mapsto (x+X_0)\otimes (y+Y_0) \in (X/X_0) \otimes (Y/Y_0). \end{aligned}$$

The bilinearity of this mapping and the universal factorization property lead to the linear mapping

$$\begin{aligned} \varPsi : X\otimes Y \rightarrow (X/X_0) \otimes (Y/Y_0), \end{aligned}$$

such that \(\varPsi (x\otimes y) = (x+X_0) \otimes (y+Y_0)\) for all \(x\in X\) and \(y\in Y\). It is clear that \(\varPsi (x\otimes y) = 0\), whenever \(x\in X_0\) or \(y\in Y_0\). Since the elements of the form \(x\otimes y\) with \(x\in X_0\) or \(y\in Y_0\) generate \({\textsf{T}}(X_0,Y_0)\), it follows that \({\textsf{T}}(X_0,Y_0) \subset \ker \varPsi \). Passing to the quotient spaces, the mapping \(\varPsi \) induces another mapping

$$\begin{aligned} \varPsi _1: (X\otimes Y)/{\textsf{T}}(X_0,Y_0) \rightarrow (X/X_0) \otimes (Y/Y_0) \end{aligned}$$

with the property that \(\varPsi _1 (x\otimes y + {\textsf{T}}(X_0,Y_0)) = (x+X_0) \otimes (y+Y_0)\) for all \(x\in X\) and \(y\in Y\). It turns out that \(\varPhi \circ \varPsi _1\) and \(\varPsi _1 \circ \varPhi \) are identity mappings on \((X\otimes Y)/{\textsf{T}}(X_0,Y_0)\) and \((X/X_0) \otimes (Y/Y_0)\), respectively, which can be verified directly on the generators. \(\square \)

The following observation shows that the ideal \(V_2\) is well chosen regarding tensor products.

Corollary 6

Let \({\varvec{V}}\) be a \(*\)-ideal in \({\mathcal {P}}_d\). There is a unique mapping

$$\begin{aligned} \varPhi : ({\mathcal {P}}_d/V) \otimes ({\mathcal {P}}_d/V) \rightarrow {\mathcal {P}}_{2d}/V_2, \end{aligned}$$

such that \(\varPhi ((p+V) \otimes (q+V)) = p\otimes q+V_2\). Moreover, \(\varPhi \) is a linear isomorphism.

Proof

This is Lemma 5 translated to the case of a \(*\)-ideal treated as a linear subspace of \({\mathcal {P}}_d\). The identification of \({\mathcal {P}}_d \otimes {\mathcal {P}}_d\) with \({\mathcal {P}}_{2d}\) is well known. \(\square \)

Corollary 7

Let \(V\subset {\mathcal {P}}_d\) be a proper \(*\)-ideal and \(\{Q_k\}_{k=0}^{\varkappa _V}\) be a rigid \({\varvec{V}}\)-basis of \({\mathcal {P}}_d\). If \(P_1\) and \(P_2\) are column polynomials, such that

$$\begin{aligned} P_1(x)^{\mathop \intercal }Q_k(y) {\mathop {=}\limits ^{{\scriptscriptstyle {{ V _2}}}}}Q_j(x)^{\mathop \intercal }P_2(y) \end{aligned}$$
(6)

with some integer \(k,j \geqslant 0\), then there exists a unique scalar matrix \({\varvec{E}}\), such that

$$\begin{aligned} P_1 {\mathop {=}\limits ^{{\scriptscriptstyle {{ V }}}}}E Q_j \quad \text {and} \quad P_2 {\mathop {=}\limits ^{{\scriptscriptstyle {{ V }}}}}E^{\mathop \intercal }Q_k. \end{aligned}$$

Proof

It is well known that if \(\{e_n\}_{n=0}^\varkappa \) with \(\varkappa \in {\mathbb {N}}\cup \{\infty \}\) is a basis of a vector space \({\varvec{X}}\), then the system \(\{e_m\otimes e_n\}_{m,n=0}^\varkappa \) forms a basis for \(X\otimes X\). Hence, the system \(\{Q_k\}_{k=0}^{\varkappa _V}\) induces the basis of \(({\mathcal {P}}_d/V) \otimes ({\mathcal {P}}_d/V)\) composed of all elements of the form \((p+V) \otimes (q+V)\), where \({\varvec{p}}\) appears in some column \(Q_j\) and \({\varvec{q}}\) does in some column \(Q_k\). By Corollary 6, the related basis of \({\mathcal {P}}_{2d}/V_2\) consists of all elements of the form \(p\otimes q+V_2\) with the same way of choosing polynomials \({\varvec{p}}\) and \({\varvec{q}}\).

By our assumption, \(P_1\) can be expressed as \(P_1 {\mathop {=}\limits ^{{\scriptscriptstyle {{ V }}}}}\sum _{n=0}^N C_n Q_n\) with unique scalar matrices \(C_n\). Similarly, \(P_2 {\mathop {=}\limits ^{{\scriptscriptstyle {{ V }}}}}\sum _{n=0}^N D_n Q_n\) with unique scalar matrices \(D_n\) (adding some zero terms, we may have the same \({\varvec{N}}\) for both \(P_1\) and \(P_2\)). Substituting this in (6), we get

$$\begin{aligned} \sum _{n=0}^N Q_n^{\mathop \intercal }(x) C_n^{\mathop \intercal }Q_k(y) {\mathop {=}\limits ^{{\scriptscriptstyle {{ V _2}}}}}\sum _{n=0}^N Q_j(x)^{\mathop \intercal }D_n Q_n (y). \end{aligned}$$

Since the left-hand side is a sum of polynomials of the form \(p(x)q(y)\) with \({\varvec{q}}\) from the column \(Q_k\), by linear independence modulo \(V_2\), we infer that \(D_n=0\) for all \({\varvec{n}}\) except \(n=k\). Similarly, all \(C_n=0\) except for \(n=j\). It follows that \(P_1 {\mathop {=}\limits ^{{\scriptscriptstyle {{ V }}}}}C_jQ_j\) and \(P_2 {\mathop {=}\limits ^{{\scriptscriptstyle {{ V }}}}}D_k Q_k\). Moreover, Eq. (6) takes the form

$$\begin{aligned} Q_j^{\mathop \intercal }(x) C_j^{\mathop \intercal }Q_k(y) {\mathop {=}\limits ^{{\scriptscriptstyle {{ V _2}}}}}Q_j^{\mathop \intercal }(x) D_k Q_k(y), \end{aligned}$$

which by linear independence forces \(C_j^{\mathop \intercal }= D_k\). The proof is complete. \(\square \)

As in one variable, we would like to emphasize that there is a deeper connection between the Christoffel–Darboux formula and the three-term relation. In fact, the aforesaid formula implies the relation.

Theorem 8

Let \(V\subset {\mathcal {P}}_d\) be a proper \(*\)-ideal, \(\{Q_k\}_{k=0}^{\varkappa _V}\) be a rigid \({\varvec{V}}\)-basis of real polynomials with \(Q_0=1\), and let \(j\in \{1,\ldots ,d\}\). Let \(N\in {\mathbb {N}}\), \(N\leqslant \varkappa _V\). Assume that \(K_n\) satisfies the \({\varvec{j}}\)th Christoffel–Darboux formula (5) for all \(n=0,\ldots ,N\) with some scalar matrices \(A_{n,j}\). Then

$$\begin{aligned} X_jQ_k {\mathop {=}\limits ^{{\scriptscriptstyle {{ V }}}}}A_{k,j}Q_{k+1}+ B_{k,j}Q_k + A_{k-1,j}^{\mathop \intercal }Q_{k-1}, \quad k=0,\ldots ,N, \end{aligned}$$

with some scalar matrices \(B_{k,j}\) (with the convention \(Q_{-1}=0\) and \(A_{-1,j}=0)\).

Proof

Let \({\tilde{K}}_n\) stand for the right-hand side of (5), \(n= 0,\ldots ,N\). By our assumption

$$\begin{aligned} {\tilde{K}}_n(x,y) = (x_j-y_j)K_n(x,y) + r_n(x)^{\mathop \intercal }w_n(y) + {\tilde{w}}_n(x)^{\mathop \intercal }{\tilde{r}}_n(y) \end{aligned}$$

with column polynomials \(r_n\), \(w_n\), \({\tilde{r}}_n\), \({\tilde{w}}_n\), such that \(r_n\) and \({\tilde{r}}_n\) are equal to 0 modulo \({\varvec{V}}\). For simplicity, we write \(R_n(x,y) = r_n(x)^{\mathop \intercal }w_n(y) + {\tilde{w}}_n(x)^{\mathop \intercal }{\tilde{r}}_n(y)\). Since

$$\begin{aligned} K_n(x,y) = Q_n(x)^{\mathop \intercal }Q_n(y) + K_{n-1}(x,y), \end{aligned}$$

we have

$$\begin{aligned} {\tilde{K}}_k(x,y) - R_k(x,y) = (x_j-y_j)Q_k(x)^{\mathop \intercal }Q_k(y) + {\tilde{K}}_{k-1}(x,y) - R_{k-1}(x,y) \end{aligned}$$

for \(k=0,\ldots ,N\). This, written explicitly, reads as follows:

$$\begin{aligned}&[A_{k,j}Q_{k+1}(x)]^{\mathop \intercal }Q_k(y) - Q_k(x)^{\mathop \intercal }[A_{k,j}Q_{k+1}(y)] - R_k(x,y)\\&\quad = (x_j-y_j) Q_k(x)^{\mathop \intercal }Q_k(y)+ [A_{k-1,j}Q_k(x)]^{\mathop \intercal }Q_{k-1}(y)\\&\qquad - Q_{k-1}(x)^{\mathop \intercal }[A_{k-1,j}Q_k(y)] - R_{k-1}(x,y), \end{aligned}$$

which, after rearranging terms, leads to

$$\begin{aligned}&[A_{k,j}Q_{k+1}(x) + A_{k-1,j}^{\mathop \intercal }Q_{k-1}(x) - x_jQ_k(x)]^{\mathop \intercal }Q_k(y) - R_k(x,y)\\&\quad = Q_k(x)^{\mathop \intercal }[A_{k,j}Q_{k+1}(y) + A_{k-1,j}^{\mathop \intercal }Q_{k-1}(y) - y_jQ_k(y)] - R_{k-1}(x,y). \end{aligned}$$

A direct application of Corollary 7 leads to a matrix \(B_{k,j}\) such that

$$\begin{aligned} A_{k,j}Q_{k+1} + A_{k-1,j}^{\mathop \intercal }Q_{k-1} - x_jQ_k {\mathop {=}\limits ^{{\scriptscriptstyle {{ V }}}}}- B_{k,j} Q_k, \end{aligned}$$

which gives the desired conclusion. \(\square \)

3 Examples

One of the examples could be simply rewritten from [14], where the author considered the tensor product of two families of orthogonal polynomials in one variable, more specifically the Krawtchouk polynomials and the Charlier ones. Digesting this example, one may grasp a general background idea that allows one to construct what may be called "product polynomials", i.e., the tensor product of orthogonal polynomials in a single variable. Noticing this, we prefer to direct our attention to two-variable polynomials which are not obtained this way.

We now focus on \({\mathcal {P}}_2/V\), where \({\varvec{V}}\) is the ideal of all polynomials vanishing on the unit circle \({\mathbb {T}}\), centered at 0. To establish the three-term relation, we will employ the well-known system \(\{z^n\}_{n\in {\mathbb {Z}}}\), which is orthonormal with respect to the normalized Lebesgue measure \({\varvec{m}}\) on \({\mathbb {T}}\). One can easily check that the system

$$\begin{aligned} \{z^0\} \cup \left\{ \tfrac{1}{\sqrt{2}} (z^n+{\bar{z}}^n):n\geqslant 1 \right\} \cup \left\{ \tfrac{1}{\sqrt{2}\mathrm i}(z^n-{\bar{z}}^n) :n\geqslant 1 \right\} \end{aligned}$$
(7)

is also orthonormal with respect to \({\varvec{m}}\). What is more, every member of \({\mathbb {C}}[z,{\bar{z}}]\) is equal modulo \({\varvec{V}}\) to a linear combination of elements of the system. Indeed, since \(z {\bar{z}} {\mathop {=}\limits ^{{\scriptscriptstyle {{ V }}}}}1\), for \(k> l\), we may write

$$\begin{aligned} z^k {\bar{z}}^l {\mathop {=}\limits ^{{\scriptscriptstyle {{ V }}}}}z^{k-l} = \tfrac{1}{2} (z^{k-l}+{\bar{z}}^{k-l}) + \tfrac{1}{2} (z^{k-l}-{\bar{z}}^{k-l}), \end{aligned}$$

so we have expressed \(z^k{\bar{z}}^l\) linearly by means of the system (7). The same can be done for \(k<l\), as \(z^k{\bar{z}}^l {\mathop {=}\limits ^{{\scriptscriptstyle {{ V }}}}}{\bar{z}}^{l-k}\). Since the case \(k=l\) is trivial, this proves our claim.

These polynomials give rise to the following real polynomials:

$$\begin{aligned}&q^{(0)}_1 (x_1,x_2) = 1, \\&\begin{gathered} q^{(k)}_1 (x_1,x_2) = \tfrac{1}{\sqrt{2}} \big ( (x_1+\mathrm{i}x_2)^k + (x_1-\mathrm{i}x_2)^k \big ),\\ q^{(k)}_2 (x_1,x_2) = \tfrac{1}{\sqrt{2}\,\mathrm{i}} \big ( (x_1+\mathrm{i}x_2)^k - (x_1-\mathrm{i}x_2)^k \big ), \end{gathered} \quad k\geqslant 1. \end{aligned}$$

We arrange them in a column form

$$\begin{aligned} Q_0 = q^{(0)}_1, \quad Q_k = \begin{bmatrix} q^{(k)}_1 \\ q^{(k)}_2 \end{bmatrix}, \ k\geqslant 1. \end{aligned}$$
(8)

Let \(L_m:{\mathcal {P}}_2 \rightarrow {\mathbb {C}}\) denote the functional

$$\begin{aligned} L_m(p) = \int _{\mathbb {T}} p(x_1,x_2) \mathrm dm(x_1,x_2), \quad p\in {\mathcal {P}}_2. \end{aligned}$$

By the properties of the system (7) and the identification of \({\mathcal {P}}_2\) with \({\mathbb {C}}[z,{\bar{z}}]\) via \(z=x_1+\mathrm{i}x_2\), we infer that (8) is a rigid \({\varvec{V}}\)-basis of \({\mathcal {P}}_2\), such that \(L_m(Q_k Q_l^{\mathop \intercal }) = 0\) if \(k\ne l\), \(L_m(Q_0 Q_0^{\mathop \intercal }) = 1\), and \(L_m(Q_k Q_k^{\mathop \intercal }) = \big [ {\begin{matrix} 1&0\\ 0&1 \end{matrix}} \big ]\) if \(k\geqslant 1\). By Theorem 2, the system \(\{Q_k\}_{k=0}^\infty \) satisfies the three-term relation

$$\begin{aligned} X_jQ_k {\mathop {=}\limits ^{{\scriptscriptstyle {{ V }}}}}A_{k,j}Q_{k+1}+ B_{k,j}Q_k + A_{k-1,j}^{\mathop \intercal }Q_{k-1}, \quad k\geqslant 0,\ j=1,2. \end{aligned}$$

Following the idea from Remark 1, we may compute matrices \(A_{k,j}\) and \(B_{k,j}\) with the help of the formulas:

$$\begin{aligned} A_{k,j} = L_m(X_jQ_kQ_{k+1}^{\mathop \intercal }), \quad B_{k,j} = L_m(X_j Q_k Q_k^{\mathop \intercal }), \quad k\geqslant 0,\ j=1,2. \end{aligned}$$

Leaving the simple though slightly tedious computations to the reader, one arrives at

$$\begin{aligned}&A_{0,1}= \big [ \tfrac{1}{\sqrt{2}}\ \ 0 \big ], \quad A_{0,2}= \big [ 0\ \ \tfrac{1}{\sqrt{2}} \big ],\\&A_{k,1} = \begin{bmatrix} \frac{1}{2} & 0 \\ 0 & \frac{1}{2} \end{bmatrix}, \quad A_{k,2} = \begin{bmatrix} 0 & \frac{1}{2} \\ -\frac{1}{2} & 0 \end{bmatrix},\quad k\geqslant 1, \end{aligned}$$

and \(B_{k,j} = 0\) for all \(k\geqslant 0\), \(j=1,2\). Theorem 4 is now ready to be applied for writing the Christoffel–Darboux formula

$$\begin{aligned} (x_j-y_j) \sum _{k=0}^n Q_k^{\mathop \intercal }(x) Q_k(y){\mathop {=}\limits ^{{\scriptscriptstyle {{ V _2}}}}}[A_{n,j} Q_{n+1}(x)]^{\mathop \intercal }Q_n(y) - Q_n(x)^{\mathop \intercal }[A_{n,j} Q_{n+1}(y)], \end{aligned}$$

where all the involved objects can be explicitly written.
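Both the displayed matrices and the formula itself can be verified symbolically: modulo \(V_2\), equality in (5) amounts to pointwise equality on \({\mathbb {T}}\times {\mathbb {T}}\), which the parametrizations \(x=(\cos t,\sin t)\) and \(y=(\cos s,\sin s)\) turn into a trigonometric identity. A sketch (sympy) of the computations left to the reader above:

```python
# The circle example: recurrence matrices and the CD formula on T x T.
import sympy as sp

t, s = sp.symbols('t s', real=True)
Lm = lambda f: sp.integrate(f, (t, 0, 2*sp.pi)) / (2*sp.pi)  # normalized measure

def Q(k, u):                     # the column (8) restricted to the circle
    if k == 0:
        return sp.Matrix([1])
    return sp.Matrix([sp.sqrt(2)*sp.cos(k*u), sp.sqrt(2)*sp.sin(k*u)])

X = {1: sp.cos(t), 2: sp.sin(t)}
for k in range(3):
    for j in (1, 2):
        A = (X[j] * Q(k, t) * Q(k + 1, t).T).applyfunc(Lm)   # A_{k,j}
        B = (X[j] * Q(k, t) * Q(k, t).T).applyfunc(Lm)       # B_{k,j} = 0
        print(k, j, A.tolist(), B.tolist())

n = 2
K = sum((Q(k, s).T * Q(k, t))[0, 0] for k in range(n + 1))   # K_n on T x T
for j in (1, 2):
    A = (X[j] * Q(n, t) * Q(n + 1, t).T).applyfunc(Lm)
    lhs = (X[j] - X[j].subs(t, s)) * K
    rhs = ((A * Q(n + 1, t)).T * Q(n, s))[0, 0] \
        - (Q(n, t).T * (A * Q(n + 1, s)))[0, 0]
    assert sp.simplify((lhs - rhs).rewrite(sp.exp).expand()) == 0
```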

Let us now turn to a more interesting example of the Bernoulli lemniscate \({\varvec{B}}\), i.e., the set of all points \((x_1,x_2) \in {\mathbb {R}}^2\) satisfying \((x_1^2+x_2^2)^2 = x_1^2-x_2^2\). In the case of polynomials in one complex variable, this case was discussed in [7] (cf. [8]). Our approach seems to be distinct from that of [7], not to mention that there is no apparent relation between orthogonality of polynomials in complex and in real variables (the preceding case of the circle was just a lucky coincidence). A parametric description of \({\varvec{B}}\) is given by

$$\begin{aligned}{}[-\tfrac{\pi }{4}, \tfrac{\pi }{4}] \ni t \mapsto \pm \gamma (t) \in {\mathbb {R}}^2, \quad \gamma (t) = \sqrt{\cos 2t} (\cos t, \sin t), \end{aligned}$$

which leads to the formula defining the measure \({\mathfrak {m}}\) on \({\varvec{B}}\)

$$\begin{aligned} L_{\mathfrak {m}}(f) := \int _B f(x_1,x_2) \mathrm d{\mathfrak {m}}(x_1,x_2)&= \alpha _{\mathfrak {m}}\int _{-\frac{\pi }{4}} ^{\frac{\pi }{4}} \big ( f(\gamma (t)) + f(-\gamma (t)) \big ) \Vert \gamma '(t)\Vert \mathrm dt \\&= \alpha _{\mathfrak {m}}\int _{-\frac{\pi }{4}} ^{\frac{\pi }{4}} \big ( f(\gamma (t)) + f(-\gamma (t)) \big ) \frac{\mathrm dt}{\sqrt{\cos 2t}}, \end{aligned}$$

where f is a Borel function bounded on B and \(\alpha _{\mathfrak {m}}>0\) is chosen so that \({\mathfrak {m}}\) is normalized by \(\int _B 1\mathrm d{\mathfrak {m}}=1\). It follows that \(L_{\mathfrak {m}}(f) = 0\) whenever:

  1. (a)

    \(f(-x_1,-x_2)= -f(x_1,x_2)\) for all \((x_1,x_2)\in B\), or

  2. (b)

    \(f(-x_1,x_2)= -f(x_1,x_2)\) for all \((x_1,x_2)\in B\), or

  3. (c)

    \(f(x_1,-x_2)= -f(x_1,x_2)\) for all \((x_1,x_2)\in B\).

Indeed, (a) is a direct consequence of the formula for \({\mathfrak {m}}\), while (b) and (c) result from

$$\begin{aligned} \int _{-\frac{\pi }{4}} ^{\frac{\pi }{4}} f(\pm \gamma (t)) \frac{\mathrm dt}{\sqrt{\cos 2t}} = \int _{-\frac{\pi }{4}} ^{\frac{\pi }{4}} f(\pm \gamma (-t)) \frac{\mathrm dt}{\sqrt{\cos 2t}}. \end{aligned}$$
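For the reader who wants to experiment, the functional \(L_{\mathfrak {m}}\) and the constant \(\alpha _{\mathfrak {m}}\) are easy to evaluate numerically; the sketch below (mpmath; the helper names are ours) also spot-checks the vanishing in cases (a)-(c):

```python
# Numerical evaluation of L_m on the Bernoulli lemniscate.
import mpmath as mp

def gamma(t):                                    # the parametrization of B
    r = mp.sqrt(mp.cos(2*t))
    return (r * mp.cos(t), r * mp.sin(t))

def L_raw(f):                                    # L_m before normalization
    def g(t):
        x1, x2 = gamma(t)
        return (f(x1, x2) + f(-x1, -x2)) / mp.sqrt(mp.cos(2*t))
    return mp.quad(g, [-mp.pi/4, mp.pi/4])       # integrable endpoint singularity

alpha = 1 / L_raw(lambda x1, x2: 1)              # alpha_m, so that L_m(1) = 1
L_m = lambda f: alpha * L_raw(f)

print(L_m(lambda x1, x2: x1))        # odd as in (a) and (b): ~ 0
print(L_m(lambda x1, x2: x1 * x2))   # odd as in (c): ~ 0
print(L_m(lambda x1, x2: x1**2))     # positive
```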

The ideal \({\varvec{V}}\) related to \({\varvec{B}}\) consists of all polynomials \(p\in {\mathcal {P}}_2\) vanishing on \({\varvec{B}}\), or, equivalently, satisfying \(L_{\mathfrak {m}}(|p|^2)=0\). In particular, \(X_1^2 - X_2^2 {\mathop {=}\limits ^{{\scriptscriptstyle {{ V }}}}}(X_1^2 + X_2^2)^2\). This yields

$$\begin{aligned} X_1^2 {\mathop {=}\limits ^{{\scriptscriptstyle {{ V }}}}}\tfrac{1}{2} (X_1^2+X_2^2 + (X_1^2 + X_2^2)^2) \quad \text {and} \quad X_2^2 {\mathop {=}\limits ^{{\scriptscriptstyle {{ V }}}}}\tfrac{1}{2} (X_1^2+X_2^2 - (X_1^2 + X_2^2)^2), \end{aligned}$$
(9)

which means that \(X_1^{2j} X_2^{2k} {\mathop {=}\limits ^{{\scriptscriptstyle {{ V }}}}}q(X_1^2+X_2^2)\) with some one-variable polynomial \({\varvec{q}}\) depending on nonnegative integers \({\varvec{j}}\) and \({\varvec{k}}\). As a consequence, if \(p \in {\mathcal {P}}_2\), then there exist \(q_j \in {\mathcal {P}}_1\), \(j=1,2,3,4\), such that

$$\begin{aligned} p {\mathop {=}\limits ^{{\scriptscriptstyle {{ V }}}}}q_1(X_1^2+X_2^2) + X_1 q_2(X_1^2+X_2^2) + X_2 q_3(X_1^2+X_2^2) + X_1X_2 q_4(X_1^2+X_2^2). \end{aligned}$$

The crucial remark to make here is that every \(p\in {\mathcal {P}}_2\) is equal modulo \({\varvec{V}}\) to a linear combination of polynomials \(p^{j,k}_l:= X_1^j X_2^k (X_1^2 + X_2^2)^l\) with integers \(j,k=0,1\) and \(l\geqslant 0\). It follows that if \(j_1,j_2,k_1,k_2 =0,1\), \((j_1,k_1) \ne (j_2,k_2)\) and \(q_1,q_2\in {\mathcal {P}}_1\), then

$$\begin{aligned} X_1^{j_1}X_2^{k_1} q_1(X_1^2+X_2^2) \cdot X_1^{j_2}X_2^{k_2} q_2(X_1^2+X_2^2) {\mathop {=}\limits ^{{\scriptscriptstyle {{ V }}}}}X_1^r X_2^s q(X_1^2+X_2^2), \end{aligned}$$
(10)

with some \(q \in {\mathcal {P}}_1\) and \(r,s =0,1\), such that \(r+s>0\). Since the right-hand side of (10) is a function satisfying one of the properties (a), (b) and (c) listed above, we infer that

$$\begin{aligned} L_{\mathfrak {m}}\big ( X_1^{j_1}X_2^{k_1} q_1(X_1^2+X_2^2) \cdot X_1^{j_2}X_2^{k_2} q_2(X_1^2+X_2^2) \big ) = 0 \end{aligned}$$

with all \(j_1,j_2,k_1,k_2,q_1,q_2\) already chosen. In other words, polynomials

$$\begin{aligned} X_1^{j_1}X_2^{k_1} q_1(X_1^2+X_2^2) \quad \text {and} \quad X_1^{j_2}X_2^{k_2} q_2(X_1^2+X_2^2) \end{aligned}$$

are \(L_{\mathfrak {m}}\)-orthogonal. In particular, polynomials \(p^{j_1,k_1}_{l_1}\) and \(p^{j_2,k_2}_{l_2}\) are \(L_{\mathfrak {m}}\)-orthogonal, provided that \(j_1,j_2,k_1,k_2\) are as above and \(l_1,l_2 \geqslant 0\).

Fix any \(j,k=0,1\) and consider the linear functional

$$\begin{aligned} L^{j,k}: {\mathcal {P}}_1 \rightarrow {\mathbb {C}}, \quad L^{j,k}(q) = L_{\mathfrak {m}}(X_1^{2j} X_2^{2k} q(X_1^2+X_2^2)), \ q\in {\mathcal {P}}_1. \end{aligned}$$

It is evident that \(L^{j,k}(|q|^2) >0\) for any \(q \ne 0\), and thus, there exists a sequence of real orthogonal polynomials \(\{q_l^{j,k}\}_{l=0}^\infty \), such that \(\deg q_l^{j,k} = l\), \(l \geqslant 0\), and

$$\begin{aligned} L^{j,k}(q_{l_1}^{j,k} q_{l_2}^{j,k}) = \delta _{l_1l_2}. \end{aligned}$$

As usual for orthogonal polynomials in one variable, the sequence is determined uniquely up to a unimodular factor for each \(q_l^{j,k}\). It is a matter of routine to verify that the family of all polynomials

$$\begin{aligned} W^{j,k}_l:= X_1^j X_2^k q_l^{j,k}(X_1^2 + X_2^2), \quad j,k=0,1, \ l\geqslant 0 \end{aligned}$$
(11)

is \(L_{\mathfrak {m}}\)-orthogonal, and even \(L_{\mathfrak {m}}\)-orthonormal. We now arrange this family in the column form

$$\begin{aligned} Q_0 = 1, \quad Q_1 = \begin{bmatrix} W^{1,0}_0 \\ W^{0,1}_0\end{bmatrix}, \quad Q_2 = \begin{bmatrix} W^{0,0}_1 \\ W^{0,0}_2 \\ W^{1,1}_0 \end{bmatrix}, \quad Q_{2s-1} = \begin{bmatrix} W^{1,0}_{2s-3} \\ W^{1,0}_{2s-2} \\ W^{0,1}_{2s-3} \\ W^{0,1}_{2s-2}\end{bmatrix}, \quad Q_{2s} = \begin{bmatrix} W^{0,0}_{2s-1} \\ W^{0,0}_{2s} \\ W^{1,1}_{2s-3} \\ W^{1,1}_{2s-2}\end{bmatrix}, \quad s\geqslant 2. \end{aligned}$$

At first glance, this does not look like a good idea leading to a rigid \({\varvec{V}}\)-basis of \({\mathcal {P}}_2\), because some coordinate polynomials in \(Q_n\) are not of degree \({\varvec{n}}\) if \(n\geqslant 2\). It turns out that for all \(n\geqslant 0\), every coordinate polynomial of \(Q_n\) is equal modulo \({\varvec{V}}\) to a polynomial of degree exactly \({\varvec{n}}\). This follows from the equation:

$$\begin{aligned} \Pi _V ({\mathcal {P}}^k_2) = {{\,\textrm{lin}\,}}\Pi _V \left( \bigcup _{n=0}^k Q_n \right) \quad \text {for every } k\geqslant 0, \end{aligned}$$
(12)

which we are now going to prove by induction. The instances of \(k=0\) and \(k=1\) are trivial, while for \(k=2\), we have \(\deg W^{0,0}_1 = \deg W^{1,1}_0 = 2\) and \(W^{0,0}_2\) is equal modulo \({\varvec{V}}\) to a polynomial of degree 2, since \((X_1^2+X_2^2)^2 {\mathop {=}\limits ^{{\scriptscriptstyle {{ V }}}}}X_1^2-X_2^2\). Let us now assume that \(k\geqslant 3\). To prove the inclusion "\(\subset \)", fix a monomial \(X_1^\alpha X_2^\beta \), such that \(\alpha + \beta \leqslant k\). By the induction hypothesis, we may focus only on the case when \(\alpha + \beta =k\). Let \([m]_2 = 0\) if the integer \({\varvec{m}}\) is even, and \([m]_2=1\) if \({\varvec{m}}\) is odd. Since

$$\begin{aligned} X_1^\alpha X_2^\beta = X_1^{[\alpha ]_2} X_2^{[\beta ]_2} X_1^{\alpha - [\alpha ]_2}X_2^{\beta - [\beta ]_2}, \end{aligned}$$

in virtue of (9), we get

$$\begin{aligned} X_1^\alpha X_2^\beta {\mathop {=}\limits ^{{\scriptscriptstyle {{ V }}}}}X_1^{[\alpha ]_2} X_2^{[\beta ]_2} p_{\alpha ,\beta } (X_1^2+X_2^2), \end{aligned}$$

with a polynomial \(p_{\alpha ,\beta }\), such that \(\deg p_{\alpha ,\beta } = \alpha +\beta - [\alpha ]_2 - [\beta ]_2\). With the notation \(\mu = [\alpha ]_2\) and \(\nu = [\beta ]_2\), expanding \(p_{\alpha ,\beta }\) in the basis \(\{q_l^{\mu ,\nu }\}_{l=0}^\infty \), we obtain

$$\begin{aligned} X_1^\alpha X_2^\beta {\mathop {=}\limits ^{{\scriptscriptstyle {{ V }}}}}X_1^\mu X_2^\nu \sum _{l=0}^{k-\mu -\nu } a_l q_l^{\mu ,\nu }(X_1^2+X_2^2) = \sum _{l=0}^{k-\mu -\nu } a_l W_l^{\mu ,\nu } \end{aligned}$$

with some coefficients \(a_l\in {\mathbb {C}}\). Considering all four possible cases of \(\mu ,\nu =0,1\), we now show that every \(W_l^{\mu ,\nu }\) in the above sum is a coordinate in one of \(Q_n\), \(n=0,\ldots ,k\). Indeed, if \(k=2s-1\), then \((\mu ,\nu )\) is equal to (1, 0) or (0, 1), so the resulting polynomials \(W^{\mu ,\nu }_l\) with \(l=0,\ldots , 2s-2\) are coordinate polynomials in one of \(Q_1,Q_3,\ldots ,Q_{2s-1}\). In turn, if \(k=2s\), then \((\mu ,\nu )\) is equal to (0, 0) or (1, 1). In the case when \((\mu ,\nu )=(0,0)\), we obtain \(W^{0,0}_l\) with \(l=0,\ldots ,2s\), which appear in the columns \(Q_0, Q_2, \ldots ,Q_{2s}\). The remaining case \((\mu ,\nu )=(1,1)\) can be done in a similar way. This proves the desired inclusion. To verify the reverse inclusion, fix \(\mu ,\nu =0,1\) and take any coordinate polynomial \(W_l^{\mu ,\nu }\) in \(Q_k\). By the induction hypothesis, it suffices to show that \(W_l^{\mu ,\nu } +V \in \Pi _V({\mathcal {P}}_2^k)\). The polynomial \(q_l^{\mu ,\nu }\) can be written as \(q_l^{\mu ,\nu } = \sum _{j=0}^l b_j X^j\) with some coefficients \(b_j\in {\mathbb {C}}\), and therefore

$$\begin{aligned} W_l^{\mu ,\nu } = X_1^\mu X_2^\nu q_l^{\mu ,\nu } (X_1^2+X_2^2) = \sum _{j=0}^l b_j X_1^\mu X_2^\nu (X_1^2+X_2^2)^j. \end{aligned}$$
(13)

Since \(W_l^{\mu ,\nu }\) is a coordinate of \(Q_k\), we see that \(\mu +\nu +l \leqslant k\). By (13), it suffices to show that

$$\begin{aligned} X_1^\mu X_2^\nu (X_1^2+X_2^2)^j + V \in \Pi _V({\mathcal {P}}_2^k), \quad \mu +\nu +j \leqslant k. \end{aligned}$$

As \((X_1^2+X_2^2)^2 {\mathop {=}\limits ^{{\scriptscriptstyle {{ V }}}}}X_1^2-X_2^2\), for an even number \({\varvec{j}}\), we get

$$\begin{aligned} X_1^\mu X_2^\nu (X_1^2+X_2^2)^j {\mathop {=}\limits ^{{\scriptscriptstyle {{ V }}}}}X_1^\mu X_2^\nu (X_1^2-X_2^2)^{j/2}, \end{aligned}$$

and the latter polynomial belongs to \({\mathcal {P}}_2^k\). If, in turn, \({\varvec{j}}\) is odd, we have

$$\begin{aligned} X_1^\mu X_2^\nu (X_1^2+X_2^2)^j&= X_1^\mu X_2^\nu (X_1^2+X_2^2) (X_1^2+X_2^2)^{j-1}\\&{\mathop {=}\limits ^{{\scriptscriptstyle {{ V }}}}}X_1^\mu X_2^\nu (X_1^2+X_2^2) (X_1^2-X_2^2)^{(j-1)/2}, \end{aligned}$$
(14)

where we are led to a polynomial of degree \(\mu +\nu + j + 1\), seemingly exceeding \({\varvec{k}}\). However, if \(j<l\), then the last polynomial in (14) is of degree less than or equal to \({\varvec{k}}\). In the remaining case, when \(j=l\) is odd, one may notice that whenever \(W^{\mu ,\nu }_l\) appears in \(Q_k\), then \(\mu +\nu +l<k\), and the resulting polynomial in (14) is again a member of \({\mathcal {P}}_2^k\). We have thus proved (12).

Since all \(W_l^{j,k}\) are \(L_{\mathfrak {m}}\)-orthonormal, we deduce that they are linearly independent modulo \({\varvec{V}}\), so by (12), the elements \(W_l^{j,k}+V\) form a basis of \({\mathcal {P}}_2/V\). It follows that the system \(\{Q_n\}_{n=0}^\infty \) meets all the requirements of a rigid basis except for the condition on the degree of the coordinate polynomials of \(Q_n\). However, this can easily be overcome by the above discussion, where we have shown how the coordinate polynomials can be replaced by polynomials of the proper degree while preserving equality modulo \({\varvec{V}}\). Since our goal is to write the three-term relation as in Theorem 2, which is an equation modulo \({\varvec{V}}\), that is

$$\begin{aligned} X_jQ_k {\mathop {=}\limits ^{{\scriptscriptstyle {{ V }}}}}A_{k,j}Q_{k+1}+ B_{k,j}Q_k + A_{k-1,j}^{\mathop \intercal }Q_{k-1},\quad k\in {\mathbb {N}},\ j=1,2, \end{aligned}$$

we do not really have to bother about the degree of polynomials in \(Q_n\). As we already know

$$\begin{aligned} A_{k,j} = L_{\mathfrak {m}}(X_jQ_kQ_{k+1}^{\mathop \intercal }) \quad \text {and} \quad B_{k,j} = L_{\mathfrak {m}}(X_jQ_kQ_k^{\mathop \intercal }). \end{aligned}$$

A typical entry of \(A_{k,j}\) or \(B_{k,j}\) takes the form \(L_{\mathfrak {m}}(X_j W_{l_1}^{\mu _1,\nu _1} W_{l_2}^{\mu _2,\nu _2})\), which is equal to zero except in the following two cases:

  1. (i)

    \(j=1\), \(\mu _1+\mu _2=1\), \(\nu _1=\nu _2\), or

  2. (ii)

    \(j=2\), \(\mu _1=\mu _2\), \(\nu _1 + \nu _2 =1\),

which can be easily justified with the help of properties (a), (b) and (c) listed above. Since all the polynomials \(W^{\mu ,\nu }_l\) in the column \(Q_k\) satisfy either \(\mu +\nu =1\) (if \({\varvec{k}}\) is odd) or \(\mu +\nu \in \{0,2\}\) (if \({\varvec{k}}\) is even), none of the entries of the matrix \(X_jQ_kQ_k^{\mathop \intercal }\) satisfies (i) or (ii), and hence, \(B_{k,j} =0\) for all admissible \({\varvec{k}}\) and \({\varvec{j}}\). We now consider \(A_{k,j}\). Fix \(s\geqslant 2\) and introduce the auxiliary column polynomial \(R_l^{\mu ,\nu } = [q_{l-1}^{\mu ,\nu } \ q_l^{\mu ,\nu } ]^{\mathop \intercal }\), \(l \geqslant 1\), \(\mu ,\nu =0,1\). Thus, we may write

$$\begin{aligned} Q_{2s-1} = \begin{bmatrix} X_1R^{1,0}_{2s-2}( X_1^2 + X_2^2) \\ X_2 R^{0,1}_{2s-2} ( X_1^2 + X_2^2)\end{bmatrix} \quad \text {and} \quad Q_{2s} = \begin{bmatrix} R^{0,0}_{2s}( X_1^2 + X_2^2) \\ X_1X_2 R^{1,1}_{2s-2} ( X_1^2 + X_2^2)\end{bmatrix}, \quad s\geqslant 2, \end{aligned}$$

where \(R^{\mu ,\nu }_l ( X_1^2 + X_2^2)\) means substituting \(X_1^2+X_2^2\) in both coordinate polynomials of \(R^{\mu ,\nu }_l\). By virtue of (i) and (ii), we get

$$\begin{aligned} A_{2s,1}&= \begin{bmatrix} L^{1,0} (R^{0,0}_{2s} (R^{1,0}_{2s})^{\mathop \intercal }) & 0\\ 0 & L^{1,1}(R^{1,1}_{2s-2} (R^{0,1}_{2s})^{\mathop \intercal }) \end{bmatrix}, \\ A_{2s-1,1}&= \begin{bmatrix} L^{1,0} (R^{1,0}_{2s-2} (R^{0,0}_{2s})^{\mathop \intercal }) & 0\\ 0 & L^{1,1}(R^{0,1}_{2s-2} (R^{1,1}_{2s-2})^{\mathop \intercal }) \end{bmatrix}, \\ A_{2s,2}&= \begin{bmatrix} 0 & L^{0,1} (R^{0,0}_{2s} (R^{0,1}_{2s})^{\mathop \intercal })\\ L^{1,1}(R^{1,1}_{2s-2} (R^{1,0}_{2s})^{\mathop \intercal }) & 0 \end{bmatrix}, \\ A_{2s-1,2}&= \begin{bmatrix} 0 & L^{1,1} (R^{1,0}_{2s-2} (R^{1,1}_{2s-2})^{\mathop \intercal })\\ L^{0,1}(R^{0,1}_{2s-2} (R^{0,0}_{2s})^{\mathop \intercal }) & 0 \end{bmatrix}. \end{aligned}$$

We encourage the reader to derive similar formulas for \(A_{k,j}\) with \(k=0,1\) and \(j=1,2\), the cases not covered above. If we now employed a one-variable orthonormalization procedure, we could consecutively compute all the matrices \(A_{k,j}\).

It is worth noting that one may derive the integral representation for all the functionals \(L^{j,k}\), \(j,k=0,1\), that is

$$\begin{aligned} L^{j,k}(q) = \alpha _{\mathfrak {m}}\int _0^1 q(x) \frac{2^{1-j-k} x^{j+k} (1+x)^j (1-x)^k}{\sqrt{x(1-x^2)}} \mathrm dx, \quad q\in {\mathcal {P}}_1. \end{aligned}$$

The weight function for \(L^{0,0}\) can be related to the weight mentioned in [15, p. 37], formula (2.9.1); to see this, it is enough to perform the change of variable \(t=1-x\) under the integral. In turn, the orthogonal polynomials associated with \(L^{j,k}\) with \(j,k=0,1\), \(j+k>0\), can be derived from those of \(L^{0,0}\) by Theorem 2.5 in [15].
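Numerically, the integral representation makes the one-variable orthonormalization mentioned above straightforward: \(L^{j,k}\) can be evaluated by quadrature, and the polynomials \(q_l^{j,k}\) obtained by Gram–Schmidt applied to the monomials. A sketch (mpmath; the helper names are ours, and alpha refers to \(\alpha _{\mathfrak {m}}\) from the earlier numerical sketch):

```python
# Building the orthonormal one-variable polynomials q_l^{j,k} for L^{j,k}.
import mpmath as mp

def make_L(j, k, alpha):
    """L^{j,k} from the displayed integral representation."""
    def w(x):
        return (2**(1 - j - k) * x**(j + k) * (1 + x)**j * (1 - x)**k
                / mp.sqrt(x * (1 - x**2)))
    return lambda q: alpha * mp.quad(lambda x: q(x) * w(x), [0, 1])

def orthonormal_q(L, n):
    """q_0, ..., q_n as callables, by Gram-Schmidt on 1, x, x^2, ..."""
    qs = []
    for l in range(n + 1):
        q = lambda x, l=l: x**l
        for prev in qs:                  # subtract the projection onto prev
            c = L(lambda x, q=q, p=prev: q(x) * p(x))
            q = lambda x, q=q, p=prev, c=c: q(x) - c * p(x)
        nrm = mp.sqrt(L(lambda x, q=q: q(x)**2))
        qs.append(lambda x, q=q, nrm=nrm: q(x) / nrm)
    return qs

# e.g., with alpha as computed in the earlier lemniscate sketch:
# q00 = orthonormal_q(make_L(0, 0, alpha), 4)
```

The entries of the matrices \(A_{k,j}\) above are then the values of \(L^{\mu ,\nu }\) on products of these polynomials.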

The case considered in [7] concerns complex orthogonality of polynomials in a single complex variable with respect to an arbitrary admissible functional \({\varvec{L}}\) (let the term "admissible" remain mysterious for the time being). This allows us to introduce this case in a somewhat sketchy way. We will show that the system

$$\begin{aligned} \Sigma _0 = 1, \quad \Sigma _1 = \begin{bmatrix}X_1 \\ X_2\end{bmatrix}, \quad \Sigma _2 = \begin{bmatrix}X_1^2 \\ X_1X_2 \\ X_2^2 \end{bmatrix}, \quad \Sigma _n = \begin{bmatrix}X_1^n \\ X_1^{n-1} X_2 \\ X_1^{n-2} X_2^2 \\ X_1^{n-3} X_2^3 \end{bmatrix}, \quad n\geqslant 3 \end{aligned}$$
(15)

forms a rigid V-basis of \({\mathcal {P}}_2\). By the discussion in [2, Section 4], the system \(\{\Sigma _n\}_{n=0}^\infty \) is a proper candidate for a rigid V-basis of \({\mathcal {P}}_2\), because its structure is the same as that of the rigid V-basis \(\{Q_n\}_{n=0}^\infty \) constructed above (i.e., the lengths of consecutive columns in \(\{\Sigma _n\}_{n=0}^\infty \) are equal to \(d_V(n)\)). Hence, it suffices to show that for every \(j,k\geqslant 0\), we have

$$\begin{aligned} \Pi _V(X_1^j X_2^k) \in {{\,\textrm{lin}\,}}\Pi _V \left( \bigcup _{l=0}^{j+k} \Sigma _l \right) . \end{aligned}$$
(16)

If \(j+k\leqslant 3\), then (16) is satisfied in an obvious way, because \(X_1^j X_2^k\) is then a member of \(\Sigma _{j+k}\). Assume that \(j+k \geqslant 4\) and \(k \geqslant 4\), which covers all the monomials of degree \(j+k\) outside of \(\Sigma _{j+k}\). Since \(X_1^2-X_2^2 {\mathop {=}\limits ^{{\scriptscriptstyle {{ V }}}}}(X_1^2+X_2^2)^2\), we infer that

$$\begin{aligned} X_2^4 {\mathop {=}\limits ^{{\scriptscriptstyle {{ V }}}}}X_1^2 - X_2^2 -X_1^4 - 2X_1^2 X_2^2. \end{aligned}$$

This implies that

$$\begin{aligned} X_1^j X_2^k = X_1^j X_2^{k-4} X_2^4 {\mathop {=}\limits ^{{\scriptscriptstyle {{ V }}}}}X_1^{j+2} X_2^{k-4} -X_1^j X_2^{k-2} -X_1^{j+4} X_2^{k-4} -2 X_1^{j+2} X_2^{k-2}. \end{aligned}$$

This way, we have reduced the power of the variable \(X_2\) by 2. If \(k-2 \geqslant 4\), we may repeat this procedure for all the monomials with powers of \(X_2\) greater than or equal to 4, eventually obtaining a linear combination of monomials with powers of \(X_2\) less than or equal to 3. Thus, we have shown (16).
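This reduction procedure is conveniently automated by division with remainder: with the lexicographic order in which \(X_2\) dominates \(X_1\), the leading monomial of the generator of \({\varvec{V}}\) is \(X_2^4\), so the remainder of any monomial has degree at most 3 in \(X_2\). A sketch (sympy):

```python
# Reducing X_1^j X_2^k modulo V = ((x1^2+x2^2)^2 - x1^2 + x2^2).
import sympy as sp

x1, x2 = sp.symbols('x1 x2')
# lex order with x2 > x1: the leading monomial of the generator is x2**4
G = sp.groebner([(x1**2 + x2**2)**2 - x1**2 + x2**2], x2, x1, order='lex')

r = G.reduce(x1**2 * x2**6)[1]       # remainder of X_1^2 X_2^6 modulo V
print(sp.expand(r))
print(sp.degree(r, x2) <= 3)         # True, in line with (16)
```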

We now want to assume that \({\varvec{L}}\) is induced by a Borel measure \(\mu \) whose support is a Zariski dense subset of the lemniscate \({\varvec{B}}\), i.e., for every \(p\in {\mathcal {P}}_2\), if \({\varvec{p}}\) vanishes on the support of \(\mu \), then it vanishes on the whole of \({\varvec{B}}\) (for the Zariski topology and related notions, see the discussion in [2, Section 9]). There is no loss of generality here, since by [2, Theorem 43] (see also [13, Theorem 1]), for any positive definite functional \(L:{\mathcal {P}}_d\rightarrow {\mathbb {C}}\) satisfying \(L(q{\bar{q}})=0\) with some \(q\in {\mathcal {P}}_d\), such that the zero set \({\mathcal {Z}}_q:= q^{-1}(\{0\})\) is compact in \({\mathbb {R}}^d\), there exists a Borel measure \(\mu \) supported in \({\mathcal {Z}}_q\), such that

$$\begin{aligned} L(p) = \int _{{\mathcal {Z}}_q} p(x)\mathrm d\mu (x), \quad p\in {\mathcal {P}}_d. \end{aligned}$$
(17)

This means, in particular, that if \(L:{\mathcal {P}}_2 \rightarrow {\mathbb {C}}\) is positive definite and \(L(p)=0\) for all \(p\in {\mathcal {P}}_2\), such that \(p\vert _B \equiv 0\), then \({\varvec{L}}\) must have the integral representation (17) with \({\mathcal {Z}}_q=B\).

It is worth noting that a subset of \({\varvec{B}}\) is Zariski dense in \({\varvec{B}}\) if and only if it is infinite. Indeed, it has to be infinite, because every finite set is closed in the Zariski topology. In turn, assume that \(B_1\) is an infinite subset of \({\varvec{B}}\), and \(p\in {\mathcal {P}}_2\) vanishes on \(B_1\). Employing the parametrization

$$\begin{aligned} \varphi : {\mathbb {R}}\ni t \mapsto \left( \frac{\cos t}{1+\sin ^2 t}, \frac{\sin t \cos t}{1+\sin ^2 t} \right) \in B, \end{aligned}$$

we see that \(p \circ \varphi \) is real-analytic and the set of its zeros has an accumulation point, because there are infinitely many \(t \in [0,2\pi ]\), such that \(\varphi (t)\in B_1\). By the identity principle for real-analytic functions (see [6, Corollary 1.2.7]), we infer that \(p\circ \varphi \) is equal to zero everywhere, and hence, \({\varvec{p}}\) vanishes on \({\varvec{B}}\).

Having fixed \({\varvec{L}}\) induced by a measure whose support is a Zariski dense subset of B, we may now perform an orthonormalization procedure applied to the system (15). As usual, this can be done recursively, but it would be convenient to notice that once we have obtained columns \(Q_0\), ..., \(Q_{k-1}\) of orthonormal column polynomials (with lengths according to the structure of the system (15)), the column polynomial

$$\begin{aligned} {\widehat{Q}}_k = \Sigma _k - \sum _{j=0}^{k-1} L(\Sigma _k Q_j^{\mathop \intercal })Q_j \end{aligned}$$

is orthogonal to all \(Q_0\), ..., \(Q_{k-1}\), and hence, it is orthogonal to \({\mathcal {P}}_2^{k-1}\). Thus, we may focus on orthonormalizing the coordinate polynomials of \({\widehat{Q}}_k\). One possible way of achieving this is to find a real matrix \({\varvec{U}}\), such that \(U L({\widehat{Q}}_k {\widehat{Q}}_k^{\mathop \intercal }) U^{\mathop \intercal }\) is the identity matrix, which is possible as \(L({\widehat{Q}}_k {\widehat{Q}}_k^{\mathop \intercal })\) is a nonsingular positive definite matrix (see [2, Proposition 13]). Then, the formula \(Q_k = U {\widehat{Q}}_k\) gives the desired column polynomial. The Cholesky decomposition can be applied to find such a matrix \({\varvec{U}}\), as sketched below.
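A sketch of this last step (numpy; the Gram matrix below is a stand-in for \(L({\widehat{Q}}_k {\widehat{Q}}_k^{\mathop \intercal })\)):

```python
# Finding U with U G U^T = I via the Cholesky factorization G = C C^T.
import numpy as np

def orthonormalizing_matrix(gram):
    C = np.linalg.cholesky(gram)     # lower triangular, gram = C C^T
    return np.linalg.inv(C)          # U = C^{-1}, so U gram U^T = I

gram = np.array([[2.0, 0.3],
                 [0.3, 1.5]])        # a stand-in for L(hatQ_k hatQ_k^T)
U = orthonormalizing_matrix(gram)
print(np.allclose(U @ gram @ U.T, np.eye(2)))   # True
```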

Final Remark The essential part of this paper was presented during the FAATNA20>22 conference, Matera, Italy, July 2022.