1 Introduction and Motivation

1.1 Spectral Methods

This work is motivated by spectral methods for time-dependent partial differential equations (PDEs) of the form

$$\begin{aligned} \frac{\partial u}{\partial t}={\mathcal {L}}u+f(x,u),\qquad t\ge 0,\quad x\in \Omega , \end{aligned}$$
(1.1)

where \({\mathcal {L}}\) is a well-posed linear operator, defining a strongly continuous semigroup, and \(\Omega \subseteq {\mathbb {R}}^d\), together with an initial condition for u(x, 0) and appropriate boundary conditions on \(\partial \Omega \). Standard examples are \({\mathcal {L}}=\Delta \) with \(f\equiv 0\) (the diffusion equation) or f a cubic polynomial in u with real zeros (the FitzHugh–Nagumo equation) and \({\mathcal {L}}={\mathrm i}\Delta \) with either \(f=-{\mathrm i}V(x)u\) (the linear Schrödinger equation) or \(f=-{\mathrm i}\lambda |u|^2u\) (the nonlinear Schrödinger equation with standard cubic nonlinearity).

In this paper, we are concerned with spectral methods applied in tandem with a splitting approach. As an example, we may commence by approximating locally the solution of (1.1) using the Strang splitting,

$$\begin{aligned} u^{n+1}=\textrm{e}^{\frac{1}{2}\Delta t {\mathcal {L}}} \textrm{e}^{\Delta t f} \textrm{e}^{\frac{1}{2}\Delta t {\mathcal {L}}} u^n,\qquad n\ge 0, \end{aligned}$$
(1.2)

where \(u^n(x)\) is an approximation to \(u(x,n\Delta t)\). Here \(\textrm{e}^{t{\mathcal {L}}}v\) is a shorthand for a numerical solution at time t of \(\partial u/\partial t={\mathcal {L}}u\), \(u(0)=v\), while \(\textrm{e}^{tf}v\) denotes a numerical solution of the ordinary differential equation (ODE) \(\,\text {d}u/\,\text {d}t=f(x,u)\), \(u(0)=v\). The splitting (1.2), which incurs a local error of \(\mathcal{O}\!\left( (\Delta t)^3\right) \), is but one example of operatorial splittings [1, 2, 20] and is intended here to illustrate a general point, namely that the solution of ‘complicated’ PDEs can be reduced to the solution of ‘simple’ PDEs and ODEs. Done correctly, this procedure does not compromise the eventual quality of the solution, with regard to accuracy and stability alike.
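
To fix ideas, here is a minimal sketch (an illustration, not part of the formal development; the function name strang_step, the toy second-difference matrix and the explicit Euler substep standing in for \(\textrm{e}^{\Delta t f}\) are all ad hoc choices) of one Strang step (1.2) for a semi-discretised system \(\varvec{u}'=L\varvec{u}+f(\varvec{u})\):

import numpy as np
from scipy.linalg import expm

def strang_step(u, L, f, dt):
    E = expm(0.5 * dt * L)       # flow of the linear part over a half-step
    u = E @ u                    # e^{dt/2 L}
    u = u + dt * f(u)            # a single explicit Euler substep, standing in for e^{dt f}
    return E @ u                 # e^{dt/2 L}

# toy reaction-diffusion example: L is a scaled second-difference matrix, f(u) = u - u^3
N, dt = 64, 1.0e-3
L = ((np.diag(np.full(N, -2.0)) + np.diag(np.ones(N - 1), 1) + np.diag(np.ones(N - 1), -1))
     * (N + 1) ** 2)
u = np.sin(np.pi * np.linspace(0.0, 1.0, N + 2)[1:-1])
for _ in range(100):
    u = strang_step(u, L, lambda v: v - v ** 3, dt)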

Another benefit of (1.2) and of similar splittings is that it is consistent with conservation of the \(\text {L}_2\) energy. Many dispersive equations, e.g. Schrödinger (linear or nonlinear), Gross–Pitaevskii, Dirac, Klein–Gordon and Korteweg–de Vries, conserve the \(\text {L}_2\) norm of the solution. This often represents a highly significant physical feature, and it is vital to respect it under discretisation. (Note that conservation of the \(\text {L}_2\) norm automatically implies numerical stability.) Because of the special form of (1.2) (and of similar splittings), the overall numerical scheme preserves the \(\text {L}_2\) energy if both the discretisations of \(\textrm{e}^{ t{\mathcal {L}}}\) and \(\textrm{e}^{tf}\) do so. Insofar as \(\textrm{e}^{tf}\) is concerned, we can use the very extensive and robust existing theory [8], e.g. use a symplectic method (which automatically also preserves the \(\text {L}_2\) norm). It is more challenging to ensure that \(\Vert \textrm{e}^{t{\mathcal {L}}}v\Vert _2=\Vert v\Vert _2\) for every v, in other words that the discretisation of \(\textrm{e}^{t{\mathcal {L}}}\) is unitary.

1.2 The Differentiation Matrix

In this paper, we are concerned with spectral methods for time-dependent problems [3, 9, 27]. In a nutshell, we commence from a set \(\Phi =\{\varphi _n\}_{n\in {\mathbb {Z}}_+}\), where each \(\varphi _n\) is defined in \(\Omega \) and endowed with appropriate regularity. We assume that \(\Phi \) is orthonormal in the standard \(\text {L}_2\) inner product,

$$\begin{aligned} \int _\Omega \varphi _m(x)\varphi _n(x) \,\text {d}x=\delta _{m,n},\qquad m,n\in {\mathbb {Z}}_+, \end{aligned}$$

and complete in \(\text {L}_2(\Omega )\), and expand a solution in the basis \(\Phi \),

$$\begin{aligned} u(x,t)=\sum _{n=0}^\infty u_n(t) \varphi _n(x), \end{aligned}$$

where the coefficients \(u_n(0)\) are determined by expanding the initial condition, while the \(u_n(t)\), \(t>0\), are typically evolved by Galerkin conditions, which for (1.1) read

$$\begin{aligned} u_m'(t)=\sum _{n=0}^\infty u_n(t) \langle {\mathcal {L}}\varphi _n+f(\,\cdot \,,\varphi _n),\varphi _m\rangle ,\qquad m\in {\mathbb {Z}}_+. \end{aligned}$$

In a practical method, we truncate the expansion and the range of m, thereby obtaining a finite-dimensional linear system of ODEs.

A substantive advantage of spectral methods is that expansions in orthonormal bases typically converge very rapidly indeed: for example, on a finite interval, orthogonal-polynomial expansions of functions analytic in the interval and its neighbourhood converge at an exponential rate. Therefore, the number of degrees of freedom, compared to the more usual finite difference or finite element methods, is substantially smaller. While this is not the entire truth—finite differences and finite elements produce sparse linear algebraic systems, while spectral methods yield dense matrices which can sometimes also be ill conditioned and, moreover, expanding a function in an orthonormal basis can be potentially costly—spectral methods are often the approach of choice in numerical computations. In the specific context of time-dependent problems, however, naive spectral methods are unstable [9]. This motivates us to consider the major concept of a differentiation matrix.

In the sequel, we restrict our narrative to the univariate case, \(\Omega \subseteq {\mathbb {R}}\), for a number of reasons. Firstly, and surprisingly, even the univariate case (as we hope to persuade the reader) is dramatically incomplete. Secondly, it lays the foundations for the multivariate case, whether by tensorial extension to parallelepipeds or by more advanced means which we intend to explore in a future paper.

The set \(\Phi \) being a basis of \(\text {L}_2(\Omega )\cap \text {C}^1(\Omega )\), any function therein can be expressed as a linear combination of the \(\varphi _n\)s; in particular, this is true of the derivatives \(\varphi _m'\). This yields a linear map represented by the infinite-dimensional matrix \(\mathscr {D}\) such that

$$\begin{aligned} \mathscr {D}_{m,n}=\int _\Omega \varphi _m'(x) \varphi _n(x)\,\text {d}x,\qquad m,n\in {\mathbb {Z}}_+. \end{aligned}$$

It is very simple to prove by integration by parts that the differentiation operator \(\varvec{D}=\partial /\partial x\) is skew Hermitian in the following three configurations of boundary conditions:

  1. The torus \(\Omega ={\mathbb {T}}\) (i.e. periodic boundary conditions);

  2. The Cauchy problem: \(\Omega ={\mathbb {R}}\); and

  3. Zero Dirichlet boundary conditions on the boundary of \(\Omega \subset {\mathbb {R}}\).

In that case \(\Vert \textrm{e}^{t\scriptstyle \varvec{D}}\Vert =1\) (unless otherwise stated, we assume in this paper the Euclidean norm) and, \(\varvec{D}^2\) being Hermitian and negative semidefinite, \(\Vert \textrm{e}^{t\scriptstyle \varvec{D}^2}\Vert \le 1\). More generally, \(\varvec{D}^{2\ell +1}\) is skew Hermitian and \((-1)^{\ell -1} \varvec{D}^{2\ell }\) negative semidefinite for all \(\ell \in {\mathbb {Z}}_+\). Consequently, once \({\mathcal {L}}=\sum _{\ell =1}^M a_\ell \varvec{D}^{\ell }\), where the even-order coefficients satisfy \((-1)^{\ell -1} a_{2\ell }\ge 0\), it is trivial to prove that \(\textrm{Re}\,\langle {\mathcal {L}}u,u\rangle \le 0\) for every u in the underlying Hilbert space.

This feature is retained by a spectral method, provided that \(\mathscr {D}\) is skew Hermitian, as is its \((N+1)\times (N+1)\) section \(\mathscr {D}_N\). In the context of the PDE (1.1), it thus follows that, letting \({\mathcal {L}}_N=\sum _{\ell =1}^M a_\ell \mathscr {D}_N^{\,\ell }\), we have \(\textrm{Re}\,(w^*{\mathcal {L}}_N w)\le 0\) for all \(w\in {\mathbb {C}}^{N+1}\). A consequence is that the \(\text {L}_2\) energy is conserved for \({\mathcal {L}}={\mathrm i}\varvec{D}^2\) (the Schrödinger case) and it dissipates for \({\mathcal {L}}=\varvec{D}^2\) (the diffusion equation case). In both cases, numerical stability comes out in the wash.
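
As a quick sanity check of this claim (an illustration added here, not taken from the source; all names are ad hoc), one may verify numerically that the symmetric part of \({\mathcal {L}}_N\) is negative semidefinite whenever \(\mathscr {D}_N\) is skew symmetric and the even-order coefficients obey the sign condition above:

import numpy as np

rng = np.random.default_rng(0)
N = 40
A = rng.standard_normal((N, N))
D = A - A.T                                    # a random real skew-symmetric matrix
coeff = {1: 0.7, 2: 1.3, 3: -0.4, 4: -2.0}     # a_2 >= 0 and a_4 <= 0; odd-order coefficients arbitrary
L = sum(c * np.linalg.matrix_power(D, k) for k, c in coeff.items())
sym = 0.5 * (L + L.T)                          # symmetric part of L_N
print(np.linalg.eigvalsh(sym).max())           # nonpositive, up to roundoff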

1.3 A Few Examples

This is the right moment to expand further on skew symmetry and differentiation matrices in a practical setting by means of a few examples, restricting ourselves for simplicity to a single space dimension. Firstly, consider the reaction–diffusion equation

$$\begin{aligned} \frac{\partial u}{\partial t}=\frac{\partial ^2 u}{\partial x^2}+f(u), \end{aligned}$$

where f is a low-degree polynomial. In a finite-dimensional setting, the equation is replaced by \(\varvec{u}'=\mathscr {D}^{\,2}\varvec{u}+f(\varvec{u})\) and, provided \(\mathscr {D}\) is skew Hermitian, \(\mathscr {D}^{\,2}=-\mathscr {D}^*\mathscr {D}\) is negative semidefinite. Once we use, e.g. the Strang splitting (1.2), the diffusion component \(\mathscr {D}^{\,2}\) is assured to be dissipative. Another example is the convection–diffusion equation for incompressible flow (assuming, for simplicity, a constant velocity field),

$$\begin{aligned} \frac{\partial u}{\partial t}=\frac{\partial ^2 u}{\partial x^2}+c\frac{\partial u}{\partial x}, \end{aligned}$$

where \(c\in {\mathbb {R}}\). Its semidiscretisation is \(\varvec{u}'=(\mathscr {D}^{\,2}+c\mathscr {D})\varvec{u}\) and, since for a skew-Hermitian matrix \(\mathscr {D}\)

$$\begin{aligned} \frac{1}{2}[(\mathscr {D}^{\,2}+c\mathscr {D})+(\mathscr {D}^{\,2}+c\mathscr {D})^*]=\mathscr {D}^{\,2}=-\mathscr {D}^*\mathscr {D}, \end{aligned}$$

a negative-semidefinite matrix, stability is assured.

Yet another example (and the underlying motivation to the work that has led to this paper) is dispersive equations of the form

$$\begin{aligned} {\mathrm i}\frac{\partial u}{\partial t}=-\Delta u+f(x,u) \end{aligned}$$

where \(f(x,u)=V(x)u\) for the linear Schrödinger equation, \(f(x,u)=\lambda |u|^2u\) for the standard nonlinear Schrödinger equation and \(f(x,u)=[V(x)+\lambda |u|^2]u\) for the Gross–Pitaevskii equation. All these equations conserve the \(\text {L}_2\) energy \(\Vert u\Vert \) (often known as ‘mass’ in this context), and this is a fundamental physical invariant. Once the equation is semi-discretised in the form \({\mathrm i}\varvec{u}'=\mathscr {D}^*\mathscr {D}\,\varvec{u}+\varvec{F}(\varvec{u})\) and solved, e.g. with the Strang splitting (1.2), the mass is conserved because \({\mathrm i}\mathscr {D}^*\mathscr {D}\) is skew Hermitian.

Our last example is the KdV equation

$$\begin{aligned} \frac{\partial u}{\partial t}+\frac{\partial ^3u}{\partial x^3}-6u\frac{\partial u}{\partial x}=0. \end{aligned}$$

Strang splitting (1.2) means that we solve separately \(\partial \varvec{u}/\partial t=-\mathscr {D}^3 \varvec{u}\) and the scalar Burgers-type equations \(\partial u_m/\partial t=6u_m (\mathscr {D}\varvec{u})_m\). In each case, once \(\mathscr {D}\) is skew Hermitian, the solution is stable and the \(\text {L}_2\) energy conserved.

An important take-away lesson of our four examples is that we typically work (for real \(\mathscr {D}\)) with a specific power (or powers) of the differentiation matrix, dictated by the space derivatives present in the differential equation. In particular, we require not just \(\mathscr {D}\) but its specific powers to be bounded.

1.4 An Orthonormal Basis

The obvious choice of an orthonormal system is a set of orthogonal polynomials—unless we use (possibly shifted) Legendre polynomials, this means replacing the \(\text {L}_2\) inner product by another, defined by the orthogonality weight function—but it is clear that this produces a lower triangular \(\mathscr {D}\). While there are nontrivial ways round it [22], there is strong motivation to consider alternative orthonormal systems.

The periodic case—without loss of generality, \(\Omega =[-\pi ,\pi ]\) with periodic boundary conditions—is obvious: we let \(\Phi =\{\textrm{e}^{{\mathrm i}nx}\}_{n\in {\mathbb {Z}}}\), the Fourier basis. An added bonus is fast expansion, by means of the Fast Fourier Transform, of any \(\text {L}_2[-\pi ,\pi ]\cap \text {C}_{\text {per}}[-\pi ,\pi ]\) function in the underlying basis. This is the paradigmatic case in which a spectral method has few competitors.

The Cauchy case \(\Omega =(-\infty ,\infty )\) has been the subject of an extensive recent study [11,12,13,14]. In particular, all orthonormal systems \(\Phi \) such that \(\mathscr {D}\) is skew Hermitian and tridiagonal have been completely characterised. Specifically, they are in a one-to-one relationship with Borel measures, supported on the entire real line. Let \(\,\text {d}\mu (x)=w(x)\,\text {d}x\) be such a measure (w might be a generalised function) and \(\mathscr {P}=\{p_n\}_{n\in {\mathbb {Z}}_+}\) the underlying set of orthonormal polynomials. It is elementary that \(\mathscr {P}\) obeys a three-term recurrence relation

$$\begin{aligned} \beta _n p_{n+1}(x)=(x+\alpha _n)p_n(x)-\beta _{n-1}p_{n-1}(x),\qquad n\in {\mathbb {Z}}_+, \end{aligned}$$
(1.3)

where \(\beta _{-1}=0\), \(\alpha _n\in {\mathbb {R}}\) and \(\beta _n>0\) for \(n\in {\mathbb {Z}}_+\). Inverse Fourier transforming \(\{w^{1/2} p_n\}_{n\in {\mathbb {Z}}_+}\) and multiplying the nth term by \({\mathrm i}^n\), we obtain an orthonormal set \(\Psi \), whose span is dense in \(\text {L}_2({\mathbb {R}})\), such that

$$\begin{aligned} \psi _n'=-\beta _{n-1}\psi _{n-1}+{\mathrm i}\alpha _n \psi _n+\beta _n\psi _{n+1},\qquad n\in {\mathbb {Z}}_+. \end{aligned}$$

In other words, \(\psi _n={\mathrm i}^n {\mathcal {F}}^{-1}(w^{1/2}p_n)\). Therefore, \(\mathscr {D}\) is skew Hermitian (skew symmetric if \(\alpha _n\equiv 0\), which is the case once w is an even function) and tridiagonal, while orthonormality follows from the Plancherel theorem. Tridiagonality is a valuable feature because it is easy to manipulate \(\mathscr {D}\) (e.g. multiply \(\mathscr {D}_N\) by a vector or approximate \(\exp (t\mathscr {D}_N)\)) and the powers of the infinite-dimensional matrix \(\mathscr {D}\) (approximating higher derivatives) remain bounded.
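
As a concrete instance (a worked example added for orientation): for the Hermite weight \(w(x)=\textrm{e}^{-x^2}\) the orthonormal polynomials obey \(xp_n(x)=\sqrt{(n+1)/2}\,p_{n+1}(x)+\sqrt{n/2}\,p_{n-1}(x)\), so that \(\alpha _n\equiv 0\) and \(\beta _n=\sqrt{(n+1)/2}\) in (1.3), and the recurrence for \(\psi _n'\) above becomes

$$\begin{aligned} \psi _n'=-\sqrt{\tfrac{n}{2}}\,\psi _{n-1}+\sqrt{\tfrac{n+1}{2}}\,\psi _{n+1},\qquad n\in {\mathbb {Z}}_+, \end{aligned}$$

i.e. (up to signs fixed by the \({\mathrm i}^n\) normalisation) the familiar derivative recurrence for Hermite functions.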

This leaves us with the third—and most difficult—case, namely zero Dirichlet boundary conditions. This is the subject of this paper.

A natural inclination is to extend the Fourier transform-based theory from \((-\infty ,\infty )\) to, say, \((-1,1)\). This can be done in one of two obvious ways and, unfortunately, both fail. The first is to choose a measure \(\,\text {d}\mu \) supported by \((-1,1)\), but this leads again to an orthonormal system supported on the entire real line, the only difference being that in this case the closure of its span is not \(\text {L}_2(-\infty ,\infty )\) but a Paley–Wiener space [11]. Another possibility is to abandon the Fourier route altogether and commence by specifying a \(\varphi _0\), subsequently determining the \(\varphi _n\)s for \(n\in {\mathbb {N}}\) and the matrix \(\mathscr {D}\) consistently with both orthogonality and tridiagonality. A forthcoming paper demonstrates how to do this algorithmically. However, given \(\varphi '=\mathscr {D}\,\varphi \), it follows by induction that \(\varphi ^{(s)}=\mathscr {D}^{\,s}\varphi \) for all \(s\in {\mathbb {Z}}_+\). Consistency with zero Dirichlet boundary conditions, though, requires \(\varphi _n(\pm 1)\equiv 0\), and this implies that \(\varphi _n^{(s)}(\pm 1)\equiv 0\) for all \(n,s\in {\mathbb {Z}}_+\). If \(\varphi _0\) is analytic in \((-1,1)\), this means that it must have an essential singularity at the endpoints. Intuitively, this is bad news, and this is confirmed by numerical experiments that indicate that the \(\varphi _n\)s develop boundary layers and wild oscillations near \(\pm 1\) and their approximation power is nil.

Both ideas above fall short and the current paper embarks on an altogether different approach, abandoning both tridiagonality and the Fourier route. Note that the existence of an essential singularity at the endpoints hinged on the fact that all powers of the infinite matrix \(\mathscr {D}\) are bounded. This is obvious once \(\mathscr {D}\) is tridiagonal (or, more generally, banded); hence, our main idea is to choose an orthonormal set \(\Phi \) such that \(\mathscr {D}^{\,s+1}\) blows up for some \(s\in {\mathbb {N}}\). At the same time, we wish to retain a major blessing of tridiagonality, namely that \(\mathscr {D}_N w\) can be computed in \(\mathcal{O}\!\left( N\right) \) operations for any \(w\in {\mathbb {C}}^{N+1}\) and, more generally, that \(\mathscr {D}_N\) is amenable to fast linear algebra.

1.5 Plan of this Paper

The main idea underlying this paper is exceedingly simple: given a measure \(\,\text {d}\mu =w\,\text {d}x\), where \(w\in \text {C}^1(a,b)\), and an underlying set \(\mathscr {P}=\{p_n\}_{n\in {\mathbb {Z}}_+}\) of orthonormal polynomials, we set

$$\begin{aligned} \varphi _n(x)=\sqrt{w(x)}p_n(x),\qquad x\in (a,b). \end{aligned}$$
(1.4)

It follows at once that \(\Phi \) is orthonormal with respect to \(\text {L}_2(a,b)\) and it is easy to determine conditions so that \(\varphi _n(a)=\varphi _n(b)=0\) for all \(n\in {\mathbb {Z}}_+\). It is not difficult to specify the conditions on w for skew symmetry of \(\mathscr {D}\). However, the narrative becomes more complicated once we seek a system such that \(\mathscr {D}^{\,k}\) is bounded for \(k=1,\ldots ,s\) and blows up for \(k=s+1\). Likewise, it is considerably more challenging to identify systems \(\Phi \) that allow for fast computation of \(\mathscr {D}_N w\) for \(w\in {\mathbb {C}}^{N+1}\).

In Sect. 2, we introduce the functions (1.4) in a more rigorous setting of Sobolev spaces and explore general properties of their differentiation matrices. Section 3 is devoted to two families of weight functions, namely the Laguerre family \(w_\alpha (x)=x^\alpha \textrm{e}^{-x}\chi _{(0,\infty )}(x)\) and the ultraspherical family \(w_\alpha (x)=(1-x^2)^\alpha \chi _{(-1,1)}(x)\). We prove that both families have a separable differentiation matrix. This feature (put to good use in Sect. 4) is very special—indeed, there are good reasons to conjecture that these two families are the only weights with this feature. We present detailed examples of two further orthogonal families, generalised Hermite weights \(w_\mu (x)=|x|^{2\mu }\textrm{e}^{-x^2}\) and Konoplev weights \(w_{\alpha ,\beta }(x)=|x|^{2\beta +1}(1-x^2)^\alpha \chi _{(-1,1)}(x)\), and prove that their differentiation matrices cannot be separable unless (for Konoplev weights) \(\beta =-\frac{1}{2}\) and the weight reduces to ultraspherical. Note, however, that separability is just one feature lending itself to fast numerical algebra and one cannot rule out that other weights might lead to differentiation matrices which, while non-separable, are amenable to fast computation.

Finally, in Sect. 4 we demonstrate how separability of the differentiation matrix can be utilised for the fast computation of \(\mathscr {D}_N \varvec{w}\), \(\varvec{w}\in {\mathbb {R}}^{N+1}\), in \(\mathcal{O}\!\left( N\right) \) operations.

Functions of the form (1.4) were first considered, in the specific case of Freud weights, by Luong [19], who demonstrated that in this case \(\mathscr {D}\) is a skew-symmetric, banded matrix with bandwidth seven. This was a serendipitous choice: in Sect. 2, we prove that the only weights that produce a banded matrix in the setting of (1.4) are generalised Freud weights!

2 W-Functions

2.1 The Definition and Some of Its Consequences

Let (a, b) be a non-empty real interval, \(-\infty \le a<b\le \infty \), and \(s\in {\mathbb {N}}\cup \{\infty \}\). We consider the subspace of the Sobolev space \(\text {H}_2^s[a,b]\) consisting of functions f such that

$$\begin{aligned} f^{(k)}(a)=f^{(k)}(b)=0,\qquad k=0,\ldots ,s-1, \end{aligned}$$

(note that \(\text {H}_2^s[a,b]\subset \text {C}^{s-1}[a,b]\), therefore the derivatives are well defined) equipped with the inner product

$$\begin{aligned} \langle f,g\rangle _s=\sum _{k=0}^{s} \int _a^b f^{(k)}(x) g^{(k)}(x)\,\text {d}x. \end{aligned}$$

A weight function \(w\in \text {L}_2(a,b)\cap \text {C}^1(a,b)\) is a positive function with all its moments

$$\begin{aligned} \mu _k=\int _a^b x^k w(x)\,\text {d}x,\qquad k\in {\mathbb {Z}}_+, \end{aligned}$$

bounded. Given a weight function, we can define (e.g. using a Gram–Schmidt process) a set of orthonormal polynomials \(\mathscr {P}=\{p_n\}_{n\in {\mathbb {Z}}_+}\) such that

$$\begin{aligned} \int _a^b p_m(x)p_n(x) w(x)\,\text {d}x=\delta _{m,n},\qquad m,n\in {\mathbb {Z}}_+. \end{aligned}$$
(2.1)

Such a set is unique once we require, for example, that the coefficient of \(x^n\) in \(p_n\) is always positive. We say that \(\varphi _n\) is the nth W-function once

$$\begin{aligned} \varphi _n(x)=\sqrt{w(x)} p_n(x),\qquad n\in {\mathbb {Z}}_+, \end{aligned}$$

and let \(\Phi =\{\varphi _n\}_{n\in {\mathbb {Z}}_+}\). It follows at once from (2.1) that \(\Phi \) is an orthonormal set with respect to the standard \(\text {L}_2\) inner product.

Remark 1

The functions \(\varphi _n\) inherit some features of orthonormal polynomials; in particular, they obey the same three-term recurrence relation. However, the expansion coefficients of an arbitrary function are different:

$$\begin{aligned} f&\sim \sum _{n=0}^\infty {\hat{f}}_n^P p_n,\quad {\hat{f}}_n^P=\int _a^b f(x) p_n(x) w(x)\,\text {d}x,\qquad f\in \text {L}_2((a,b),w\,\text {d}x),\\ f&\sim \sum _{n=0}^\infty {\hat{f}}_n^\Phi \varphi _n,\quad {\hat{f}}_n^\Phi =\int _a^b f(x) p_n(x)\sqrt{w(x)}\,\text {d}x,\qquad f\in \text {L}_2(a,b). \end{aligned}$$

Remark 2

An important difference between \(\mathscr {P}\) and \(\Phi \) is that while the \(p_n\) are polynomials, hence analytic functions, the W-functions carry over potential singularities of the weight function. For example, for the Chebyshev weight function \(w(x)=(1-x^2)^{-1/2}\chi _{(-1,1)}(x)\) the \(\varphi _n\)s have weak singularity at the endpoints \(\pm 1\), while their derivatives possess strong singularity there.

We let \(\mathscr {D}\) stand for the infinite-dimensional differentiation matrix

$$\begin{aligned} \mathscr {D}_{m,n}=\int _a^b \varphi _m'(x)\varphi _n(x)\,\text {d}x,\qquad m,n\in {\mathbb {Z}}_+. \end{aligned}$$
(2.2)

We say that w is of index \(s\in {\mathbb {N}}\cup \{\infty \}\) and denote this by \(\text {ind}\, w=s\) if \(\mathscr {D}^{\,k}\) is bounded for \(k=1,\ldots ,s\), while \(\mathscr {D}^{\,s+1}\) is unbounded.

Lemma 1

\(\mathscr {D}\) is skew symmetric if and only if \(w(a)=w(b)=0\).

Proof

Assume first that \(-\infty<a<b<\infty \) and note that \(\mathscr {D}\) is skew symmetric if and only if \(\mathscr {D}_{m,n}+\mathscr {D}_{n,m}=0\), \(m,n\in {\mathbb {Z}}_+\). Since

$$\begin{aligned} \varphi _n'=\sum _{k=0}^\infty \mathscr {D}_{n,k} \varphi _k,\qquad n\in {\mathbb {Z}}_+, \end{aligned}$$

it follows from (2.2) and the orthonormality of \(\Phi \) that \(\mathscr {D}\) is skew symmetric if

$$\begin{aligned} \int _a^b \frac{\,\text {d}\sqrt{w(x)} p_m(x)}{\,\text {d}x} \sqrt{w(x)}p_n(x) \,\text {d}x+\int _a^b \sqrt{w(x)}p_m(x)\frac{\,\text {d}\sqrt{w(x)}p_n(x)}{\,\text {d}x} \,\text {d}x=0 \end{aligned}$$

for \(m,n\in {\mathbb {Z}}_+\). The latter is equivalent, for every \(m,n\in {\mathbb {Z}}_+\), to

$$\begin{aligned} w(b)p_m(b)p_n(b)=w(a)p_m(a)p_n(a). \end{aligned}$$

All the zeros of orthogonal polynomials reside in (a, b). Therefore, they cannot vanish at the endpoints and \((-1)^k p_k(a)p_k(b)>0\), \(k\in {\mathbb {N}}\); hence, \(p_m(a)p_n(a)\) cannot equal \(p_m(b)p_n(b)\) for all \(m,n\in {\mathbb {N}}\). We deduce that \(\mathscr {D}\) is skew symmetric if and only if \(w(a)=w(b)=0\).

The proof is similar—in fact, somewhat simpler—once either \(b=\infty \) or \(a=-\infty \). If \((a,b)=(-\infty ,\infty )\), then \(\text {L}_2\) boundedness and continuity imply \(w(\pm \infty )=0\), \(\mathscr {D}\) is skew symmetric, and there is nothing to prove. \(\square \)

Consequently, we impose an additional condition on the weight function, namely that it vanishes at the endpoints. Note that this is automatically true once an endpoint is infinite.

A quintessential example of a family of W-functions is Hermite functions

$$\begin{aligned} \varphi _n(x)=\frac{\textrm{e}^{-x^2/2}}{\sqrt{2^nn!\sqrt{\pi }}} \text {H}_n(x),\qquad n\in {\mathbb {Z}}_+, \end{aligned}$$

where the \(\text {H}_n\)s are standard Hermite polynomials. Hermite functions are well known in mathematical physics, because they are eigenfunctions of the quantum harmonic oscillator. They can be derived from orthonormalised Hermite polynomials via the Fourier transform route, as mentioned in the introduction and in Iserles and Webb [11]; hence, their differentiation matrix is tridiagonal. On the other hand, they are W-functions with \(w(x)=\textrm{e}^{-x^2}\). Tridiagonality implies that \(\text {ind}\,w=\infty \). More generally, \(\text {ind}\,w=\infty \) once \(\mathscr {D}\) is a banded matrix and it is interesting to characterise all weight functions with this feature.
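
As an illustration (a numerical sketch added here; the function name hermite_D and the truncation parameters are ad hoc), the differentiation matrix of the Hermite W-functions can be assembled by Gauss–Hermite quadrature directly from (2.2), and its skew symmetry and tridiagonality observed:

import numpy as np
from math import factorial
from numpy.polynomial.hermite import hermgauss, hermval, hermder

def hermite_D(N, K=80):
    x, wq = hermgauss(K)                                  # Gauss-Hermite nodes/weights for e^{-x^2}
    nu = np.array([np.sqrt(2.0 ** n * factorial(n) * np.sqrt(np.pi)) for n in range(N)])
    c = np.eye(N)                                         # c[n]: coefficient vector of H_n
    H = np.array([hermval(x, c[n]) for n in range(N)]) / nu[:, None]
    dH = np.array([hermval(x, hermder(c[n])) for n in range(N)]) / nu[:, None]
    # phi_m'(x) phi_n(x) = e^{-x^2} (H_m'(x) - x H_m(x)) H_n(x) / (nu_m nu_n)
    return ((dH - x * H) * wq) @ H.T

D = hermite_D(12)
print(np.abs(D + D.T).max(), np.abs(np.triu(D, 2)).max())   # both at roundoff level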

Theorem 2

The differentiation matrix \(\mathscr {D}\) of a system of W-functions is banded if and only if \(w(x)=\textrm{e}^{-c(x)}\), \(x\in {\mathbb {R}}\), where c is an even-degree polynomial whose highest degree coefficient is strictly positive.

Proof

Letting \(m\le n-1\), orthogonality implies that

$$\begin{aligned} \mathscr {D}_{m,n}=\int _a^b w\!\left( \frac{1}{2} \frac{w'}{w}p_m+p_m'\right) \!p_n\,\text {d}x=\frac{1}{2} \int _a^b w'p_m p_n\,\text {d}x \end{aligned}$$

while for \(m\ge n+1\) skew symmetry yields

$$\begin{aligned} \mathscr {D}_{m,n}=-\frac{1}{2} \int _a^b w'p_m p_n\,\text {d}x. \end{aligned}$$
(2.3)

Assume that \(\mathscr {D}\) has bandwidth \(2L+1\), in other words that \(\mathscr {D}_{m,n}=0\) for \(|m-n|\ge L+1\). It follows for \(m\le n-1\) that

$$\begin{aligned} \int _a^b w X_m p_n\,\text {d}x=0,\quad n\ge m+L+1\qquad \text{ where }\qquad X_m=\frac{w'}{w}p_m. \end{aligned}$$

Expanding \(X_m\) in the basis \(\mathscr {P}\), it follows that \(X_m\) is a polynomial of degree at most \(m+L\). However,

$$\begin{aligned} w'=\frac{X_m}{p_m}w\qquad \rightarrow \qquad w(x)=w(x_0)\exp \left( \int _{x_0}^x \frac{X_m(y)}{p_m(y)}\,\text {d}y\right) \end{aligned}$$

for some \(x_0\). Since w is independent of m, necessarily \(p_m\) divides \(X_m\) and the quotient \(X_m/p_m\) is a polynomial independent of m. Therefore, without loss of generality \(w(x)=\textrm{e}^{-c(x)}\), where c is a polynomial of degree \(L+1\). Since w is integrable and \(w(a)=w(b)=0\), necessarily \(a=-\infty \), \(b=\infty \), L is odd and c is an even-degree polynomial with strictly positive leading coefficient. \(\square \)

We have recovered precisely the W-functions associated with generalised Freud polynomials that have been originally considered in Luong [19]. However, such W-functions are of little interest within the context of this paper, since we seek weight functions of finite index.

This is the point to note the expression (2.3) for the elements of \(\mathscr {D}\) such that \(m\ge n+1\). (If \(m\le n-1\) we need to flip the sign.) We will make much use of it in the sequel.

2.2 The Boundedness of \(\mathscr {D}^{\,s}\)

We assume in this section that the weight w is strictly positive in (a, b), as smooth in [a, b] as needed in our construction, and set

$$\begin{aligned} q_j(x)=\frac{\,\text {d}^j \sqrt{w(x)}}{\,\text {d}x^j},\qquad j\in {\mathbb {Z}}_+. \end{aligned}$$

Therefore,

$$\begin{aligned} \varphi _m^{(\ell )}=\sum _{j=0}^\ell {\ell \atopwithdelims ()j} q_j p_m^{(\ell -j)},\qquad \ell ,m\in {\mathbb {Z}}_+. \end{aligned}$$

As long as \(\mathscr {D}^{\,s}\) is bounded, we have

$$\begin{aligned} (\mathscr {D}^{\,s})_{m,n}=\int _a^b \varphi _m^{(s)}\varphi _n\,\text {d}x=\sum _{j=0}^s {s\atopwithdelims ()j} \int _a^b \sqrt{w} q_j p_m^{(s-j)}p_n\,\text {d}x. \end{aligned}$$
(2.4)

It is trivial to prove that

$$\begin{aligned} q_r=\sum _{j=1}^r \frac{U_{r,j}}{w^{j-\frac{1}{2}}}, \end{aligned}$$

where each \(U_{r,j}\) is a linear combination of products of the form \(\prod _i w^{(\ell _i)}\) such that \(\ell _i\ge 1\) and \(\sum _i \ell _i=r\): for example,

$$\begin{aligned} U_{4,1}=\frac{1}{2} w^{(4)},\qquad U_{4,2}=-\frac{3}{4} {w''}^2-w'w''',\qquad U_{4,3}=\frac{9}{4} {w'}^2w'',\qquad U_{4,4}=-\frac{15}{16}{w'}^4. \end{aligned}$$

In general, \(U_{r,r}=(-1)^{r-1} \frac{(2r-2)!}{2^{2r-1}(r-1)!}\,{w'}^r\), \(r\in {\mathbb {N}}\).

Since \(w(x)>0\) in (a, b), the only possible source of singularity in (2.4) is that \(\varphi _m^{(s)}\) is non-integrable at an endpoint. Recalling that \(w(a)=w(b)=0\), integrability is lost exclusively when dividing by a power of w, and the larger the power, the more significant the singularity. In other words, (2.4) is bounded for all \(m,n\in {\mathbb {Z}}_+\) only if the integral

$$\begin{aligned} \int _a^b \sqrt{w} q_s p\,\text {d}x \end{aligned}$$

is bounded for any polynomial p, and this is contingent on \({\tilde{w}}_s={w'}^s/w^{s-1}\) being a signed weight function, i.e. all its moments exist and \({\tilde{w}}_s\not \equiv 0\). The following theorem is thereby true.

Theorem 3

A necessary condition for \(\text {ind}\,w\ge s\) is that \({\tilde{w}}_r\), \(r=2,\ldots ,s\), are signed weight functions.

Let \(-\infty<a<b<\infty \). Regularity and \(w(a)=w(b)=0\) imply that

$$\begin{aligned} w(x)=(x-a)^\alpha (b-x)^\beta v(x),\qquad x\in [a,b],\quad v(a),v(b)\ne 0. \end{aligned}$$
(2.5)

Therefore, after elementary algebra,

$$\begin{aligned} {\tilde{w}}_s=(x-a)^{\alpha -s}(b-x)^{\beta -s} v \left[ (\alpha b+\beta a)-(\alpha +\beta )x+(x-a)(b-x)\frac{v'}{v}\right] ^{\!s}\!. \end{aligned}$$

Theorem 4

A necessary condition for \(\text {ind}\, w\ge s\) in a finite interval (a, b) is that \(\alpha ,\beta >s-1\). Likewise, once \(b=\infty \), we need \(\alpha >s-1\) and for \(a=-\infty \) the condition is \(\beta >s-1\).

Proof

Consistently with our assumptions, \(v\ne 0\) in [a, b]; therefore, the only source of singularity may come from \((x-a)^{\alpha -s}\) and \((b-x)^{\beta -s}\). We conclude that, for \(\mathscr {D}^{\,s}\) to be bounded, we need \(\alpha ,\beta >s-1\). The semi-infinite cases follow in an identical (and simpler!) manner. \(\square \)
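
To illustrate these conditions by a worked example (added here; it anticipates the weight of Sect. 3.2), consider the ultraspherical weight \(w_\alpha (x)=(1-x^2)^\alpha \chi _{(-1,1)}(x)\). Then \(w_\alpha '(x)=-2\alpha x(1-x^2)^{\alpha -1}\) and

$$\begin{aligned} {\tilde{w}}_s=\frac{{w_\alpha '}^s}{w_\alpha ^{s-1}}=(-2\alpha x)^s(1-x^2)^{\alpha -s}, \end{aligned}$$

whose moments on \((-1,1)\) are all bounded if and only if \(\alpha >s-1\), in agreement with Theorem 4.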

In the special case \(s=2\), we can complement Theorem 3 with a sufficient condition.

Theorem 5

A necessary and sufficient condition for \(\text {ind}\,w\ge 2\) is that \({\tilde{w}}_2={w'}^2/w\) is a signed measure.

Proof

We compute \(\mathscr {D}^{\,2}\) directly. Using skew symmetry,

$$\begin{aligned} \mathscr {D}_{m,n}^{\,2}=\sum _{\ell =0}^\infty \mathscr {D}_{m,\ell }\mathscr {D}_{\ell ,n}=-\sum _{\ell =0}^{n-1} \mathscr {D}_{m,\ell }\mathscr {D}_{n,\ell } +\sum _{\ell =n+1}^{m-1} \mathscr {D}_{m,\ell }\mathscr {D}_{\ell ,n} -\sum _{\ell =m+1}^\infty \mathscr {D}_{\ell ,m}\mathscr {D}_{\ell ,n}. \end{aligned}$$

Recalling (2.3), let us consider the infinite sum

$$\begin{aligned} -\sum _{\ell =m+1}^\infty \mathscr {D}_{\ell ,m}\mathscr {D}_{\ell ,n}= & {} -\frac{1}{4} \int _a^b \int _a^b w'(x)w'(y)p_m(x)p_n(y)\sum _{\ell =m+1}^\infty p_\ell (x)p_\ell (y) \,\text {d}x\,\text {d}y\\= & {} -\frac{1}{4} \int _a^b \int _a^b w'(x)w'(y)p_m(x)p_n(y)\sum _{\ell =0}^\infty p_\ell (x)p_\ell (y) \,\text {d}x\,\text {d}y\\{} & {} +\frac{1}{4} \sum _{\ell =0}^m \int _a^b w'(x)p_m(x) p_\ell (x)\,\text {d}x\int _a^b w'(y)p_n(y) p_\ell (y) \,\text {d}y\\= & {} -\frac{1}{4} \int _a^b \int _a^b w'(x)w'(y)p_m(x)p_n(y)\sum _{\ell =0}^\infty p_\ell (x)p_\ell (y) \,\text {d}x\,\text {d}y\\{} & {} + \sum _{\ell =0}^{n-1} \mathscr {D}_{m,\ell }\mathscr {D}_{n,\ell } -\sum _{\ell =m+1}^{n-1} \mathscr {D}_{m,\ell }\mathscr {D}_{\ell ,n}. \end{aligned}$$

Therefore,

$$\begin{aligned} \mathscr {D}_{m,n}^{\,2}=-\frac{1}{4} \int _a^b\! \int _a^b w'(x)w'(y)p_m(x)p_n(y)\sum _{\ell =0}^\infty p_\ell (x)p_\ell (y) \,\text {d}x\,\text {d}y. \end{aligned}$$
(2.6)

Let \(\mathscr {P}\) be orthonormal and complete in \(\text {L}_2((a,b),w\,\text {d}x)\) and \(f\in \text {L}_2((a,b),w\,\text {d}x)\). Then

$$\begin{aligned} f(x)=\sum _{m=0}^\infty {\hat{f}}_m p_m(x),\qquad \text{ where }\qquad {\hat{f}}_m=\int _a^b w(x)f(x)p_m(x)\,\text {d}x. \end{aligned}$$

Moreover, by the Parseval theorem,

$$\begin{aligned} \int _a^b w(x)|f(x)|^2\,\text {d}x=\Vert f\Vert ^2=\sum _{m=0}^\infty |{\hat{f}}_m|^2. \end{aligned}$$
(2.7)

Since

$$\begin{aligned} |{\hat{f}}_m|^2=\int _a^b \!\int _a^b w(x)w(y)f(x)\overline{f(y)} p_m(x)p_m(y)\,\text {d}x\,\text {d}y, \end{aligned}$$

exchanging summation and integration we have

$$\begin{aligned} \Vert f\Vert ^2=\int _a^b\!\int _a^b w(x)w(y)f(x)\overline{f(y)} \sum _{m=0}^\infty p_m(x) p_m(y) \,\text {d}x\,\text {d}y. \end{aligned}$$

Let

$$\begin{aligned} K(x,y)=\sqrt{w(x)w(y)} \sum _{m=0}^\infty p_m(x)p_m(y) \end{aligned}$$

be the Christoffel–Darboux kernel. It now follows from (2.7) that for every \(f\in \text {L}_2((a,b),w\,\text {d}x)\) it is true that

$$\begin{aligned} \int _a^b w(x) |f(x)|^2\,\text {d}x=\int _a^b\int _a^b \sqrt{w(x)w(y)} f(x)\overline{f(y)} K(x,y)\,\text {d}x\,\text {d}y \end{aligned}$$

and we deduce that

$$\begin{aligned} K(x,y)=\delta _{x-y}. \end{aligned}$$
(2.8)

In other words, K is a reproducing kernel.

We now return to (2.6), deducing that

$$\begin{aligned} \mathscr {D}_{m,n}^{\,2}&=-\frac{1}{4} \int _a^b \int _a^b \frac{w'(x)w'(y)}{\sqrt{w(x)w(y)}} p_m(x)p_n(y) K(x,y)\,\text {d}x\,\text {d}y\\ &=-\frac{1}{4} \int _a^b \frac{{w'}^2(x)}{w(x)} p_m(x)p_n(x) \,\text {d}x=-\frac{1}{4} \int _a^b {\tilde{w}}_2(x) p_m(x)p_n(x) \,\text {d}x. \end{aligned}$$

This is bounded because \({\tilde{w}}_2\) is a signed measure and \(p_mp_n\) a polynomial, and we deduce that \(\text {ind}\,w\ge 2\). The necessity of \({\tilde{w}}_2\) being a signed measure is obvious from the argument that led to Theorem 4. \(\square \)

3 Separable Systems

In this section, we consider two families of weight functions that share a hugely beneficial attribute of separability, and we also provide two examples of weights that lack this feature.

We say that a weight function w is separable if there exist real sequences \({\mathfrak {a}}=\{{\mathfrak {a}}_n\}_{n\in {\mathbb {Z}}_+}\) and \({\mathfrak {b}}=\{{\mathfrak {b}}_n\}_{n\in {\mathbb {Z}}_+}\) such that

$$\begin{aligned} \mathscr {D}_{m,n}= \left\{ \begin{array}{ll} {\mathfrak {a}}_m{\mathfrak {b}}_n, &{} \qquad m\ge n+1,\\ 0, &{} \qquad m=n,\\ -\mathfrak {a}_n\mathfrak {b}_m, &{} \qquad m\le n-1, \end{array}\right. \qquad m,n\in {\mathbb {Z}}_+ \end{aligned}$$
(3.1)

and it is symmetrically separable subject to the existence of real sequences \({\mathfrak {a}}=\{{\mathfrak {a}}_n\}_{n\in {\mathbb {Z}}_+}\) and \({\mathfrak {b}}=\{{\mathfrak {b}}_n\}_{n\in {\mathbb {Z}}_+}\) such that

$$\begin{aligned} \mathscr {D}_{m,n}= \left\{ \begin{array}{ll} {\mathfrak {a}}_m{\mathfrak {b}}_n, &{} \qquad m+n\ \text{ odd, }\; m\ge n+1,\\ 0, &{} \qquad m+n\ \text{ even },\\ -{\mathfrak {a}}_n{\mathfrak {b}}_m, &{} \qquad m+n\ \text{ odd, }\; m\le n-1, \end{array}\right. \qquad m,n\in {\mathbb {Z}}_+. \end{aligned}$$
(3.2)

It is demonstrated in Sect. 4 that separability or symmetric separability allows for very rapid computation of products of the form \(\mathscr {D}_N\varvec{v}\) for \(\varvec{v}\in {\mathbb {R}}^{N+1}\). This is intimately related to earlier results on fast computation of some structured algebraic systems in Refs. [6, 7].
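
To see why separability matters computationally (a sketch added here, with the ad hoc name apply_separable_D; it merely illustrates the point and does not anticipate the precise algorithm of Sect. 4), note that (3.1) gives \((\mathscr {D}_N\varvec{v})_m={\mathfrak {a}}_m\sum _{n<m}{\mathfrak {b}}_nv_n-{\mathfrak {b}}_m\sum _{n>m}{\mathfrak {a}}_nv_n\), and both sums can be accumulated as prefix and suffix sums in \(\mathcal{O}\!\left( N\right) \) operations:

import numpy as np

def apply_separable_D(a, b, v):
    # (D v)_m = a_m * sum_{n<m} b_n v_n - b_m * sum_{n>m} a_n v_n, via prefix/suffix sums
    bv = np.cumsum(b * v)
    av = np.cumsum((a * v)[::-1])[::-1]
    low = np.zeros_like(v); low[1:] = bv[:-1]
    up = np.zeros_like(v); up[:-1] = av[1:]
    return a * low - b * up

N = 200
rng = np.random.default_rng(1)
a, b, v = rng.random(N), rng.random(N), rng.standard_normal(N)
D = np.tril(np.outer(a, b), -1)
D = D - D.T                                                # the structure (3.1)
print(np.abs(D @ v - apply_separable_D(a, b, v)).max())    # agreement to roundoff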

In this section, we consider two families of measures, one separable and the other symmetrically separable: the Laguerre weight \(w(x)=x^\alpha \textrm{e}^{-x}\chi _{(0,\infty )}(x)\) and the ultraspherical weight \((1-x^2)^\alpha \chi _{(-1,1)}(x)\), respectively. In a way, they are the most obvious measures in intervals of the form \((0,\infty )\) and \((-1,1)\), respectively. Yet, interestingly, separability appears to be a very rare feature and we provide counterexamples further in this section.

In both Laguerre and ultraspherical cases, we are able to present comprehensive analysis, deriving the sequences \({\mathfrak {a}},{\mathfrak {b}}\) explicitly, determining \(\text {ind}\,w\) and (in Sect. 4) discussing the optimal choice of the parameter \(\alpha \). While both Laguerre and ultraspherical polynomials have been comprehensively studied, the separability and the formulæ (3.1) and (3.2) are, to the best of the author’s knowledge, new.

3.1 The Laguerre Family

Laguerre polynomials are orthogonal with respect to the Laguerre weight,

$$\begin{aligned} \int _0^\infty x^\alpha \textrm{e}^{-x} \text {L}_m^{(\alpha )}(x)\text {L}_n^{(\alpha )}(x)\,\text {d}x=\frac{{\Gamma }(n+1+\alpha )}{n!}\delta _{m,n},\qquad m,n\in {\mathbb {Z}}_+,\quad \alpha >-1. \end{aligned}$$

[23, p. 206]. Here we consider just the case \(\alpha >0\), so that the weight function vanishes at the origin. We have

$$\begin{aligned} p_n(x)&=\sqrt{\frac{n!}{{\Gamma }(n+1+\alpha )}} \text {L}_n^{(\alpha )}(x),\\ \varphi _n(x)&=\sqrt{\frac{n!}{{\Gamma }(n+1+\alpha )}} x^{\alpha /2}\textrm{e}^{-x/2}\text {L}_n^{(\alpha )}(x),\qquad n\in {\mathbb {Z}}_+. \end{aligned}$$

In Theorem 12, we determine that the Laguerre weight is separable—the proof requires a fair bit of algebraic computation and is relegated to Appendix A. The separability coefficients are given in (A.8), which we repeat here for clarity,

$$\begin{aligned} {\mathfrak {a}}_m=\sqrt{\frac{m!}{2{\Gamma }(m+1+\alpha )}}\sim \frac{1}{m^{\alpha /2}},\quad {\mathfrak {b}}_n=\sqrt{\frac{{\Gamma }(n+1+\alpha )}{2n!}}\sim n^{\alpha /2},\qquad m,n\in {\mathbb {Z}}_+. \end{aligned}$$
(3.3)

Note that

$$\begin{aligned} {\mathfrak {a}}_m{\mathfrak {b}}_m\equiv \frac{1}{2},\qquad m\in {\mathbb {Z}}_+ \end{aligned}$$
(3.4)

this will be important in the sequel.

Theorem 4 presents a necessary condition for \(\text {ind}\, w\ge s\) for \(s\ge 2\): for a Laguerre weight \(w=w_\alpha \) it translates to \(\alpha >s-1\). In the remainder of this subsection, we wish to prove that for the Laguerre weight function this condition is also sufficient.

The matrix \(\mathscr {D}^{\,s}\) is absolutely bounded for \(s\ge 0\) if

$$\begin{aligned} \sum _{k_1=0}^\infty \sum _{k_2=0}^\infty \cdots \sum _{k_{s-1}=0}^\infty |\mathscr {D}_{m,k_1}\mathscr {D}_{k_1,k_2}\cdots \mathscr {D}_{k_{s-2},k_{s-1}} \mathscr {D}_{k_{s-1},n}|<\infty , \qquad m,n\in {\mathbb {Z}}_+. \end{aligned}$$
(3.5)

It is clear that absolute boundedness implies boundedness.

We assume that \(m\ge n+1\) and observe that everything depends on the interplay of the relative sizes of \(k_0=m,k_1,k_2,\ldots ,k_{s-1},k_s=n\) because, for example,

$$\begin{aligned} k_j>k_{j+1}\quad \Rightarrow \quad |\mathscr {D}_{k_j,k_{j+1}}|={\mathfrak {a}}_{k_j}{\mathfrak {b}}_{k_{j+1}},\qquad k_j<k_{j+1}\quad \Rightarrow \quad |\mathscr {D}_{k_j,k_{j+1}}|={\mathfrak {a}}_{k_{j+1}}{\mathfrak {b}}_{k_j}. \end{aligned}$$

We can disregard the case \(k_j=k_{j+1}\) because then \(\mathscr {D}_{k_j,k_{j+1}}=0\) and the entire product vanishes; hence, we assume that always \(k_j\ne k_{j+1}\). We use the shorthand \(\searrow \) for \(k_j>k_{j+1}\) and \(\nearrow \) for \(k_j<k_{j+1}\). Note that once s is even then \(\mathscr {D}^{\,s}\) is symmetric and diagonal elements no longer vanish: in that case we need to consider also the case \(m=n\) but the proof is identical.

To illustrate our argument, for \(s=4\) we have eight options:

(a display of the eight possible \(\nearrow \)/\(\searrow \) configurations)

except that \(\nearrow ^3\) is impossible because \(k_0=m> n=k_2\).

We let \({\mathcal {Q}}_N\) stand for a generic polynomial of degree exactly N and note for further use the following technical result with a straightforward proof.

Proposition 6

The sum

$$\begin{aligned} \sum _{k=1}^K \frac{{\mathcal {Q}}_N(k)}{k^\alpha }\sim c K^{N-\alpha +1},\qquad K\gg 1,\quad N\ne \alpha -1, \end{aligned}$$

converges as \(K\rightarrow \infty \) if and only if \(\alpha >N+1\). Here c is a constant, while “\(\sim \)” means that we are disregarding lower-order terms.

In general, the main idea is to write a sequence of \(\nearrow \)s and \(\searrow \)s in the form

$$\begin{aligned} \nearrow ^{i_1}\searrow ^{j_1}\nearrow ^{i_2}\searrow ^{j_2}\cdots \nearrow ^{i_t}\searrow ^{j_t}, \end{aligned}$$

where \(i_k,j_k\ge 0\) and \(\sum _{k=1}^t (i_k+j_k)=s\). We call \(\nearrow ^r\) a \(\nearrow \)-pre-chain of length r and \(\searrow ^r\) a \(\searrow \)-pre-chain of length r; in other words, we decompose each product in (3.5) into a sequence of alternating pre-chains.

Consider first a \(\searrow \)-pre-chain of length \(r\ge 1\). Because of (3.4), it equals

$$\begin{aligned}{} & {} \sum _{k_\ell =0}^{k_{\ell -1}-1} \sum _{k_{\ell +1}=0}^{k_\ell -1} \cdots \sum _{k_{\ell +r-1}=0}^{k_{\ell +r-2}-1} |\mathscr {D}_{k_{\ell -1},k_\ell } \mathscr {D}_{k_\ell ,k_{\ell +1}}\cdots \mathscr {D}_{k_{\ell +r-2},k_{\ell +r-1}}|\\{} & {} \quad =\sum _{k_\ell =0}^{k_{\ell -1}-1} \sum _{k_{\ell +1}=0}^{k_\ell -1} \cdots \sum _{k_{\ell +r-1}=0}^{k_{\ell +r-2}-1} \prod _{j=\ell -1}^{\ell +r-2} {\mathfrak {a}}_{k_j}{\mathfrak {b}}_{k_{j+1}}=\frac{{\mathfrak {a}}_{k_{\ell -1}}}{2^{r-1}} \sum _{k_\ell =0}^{k_{\ell -1}-1} \sum _{k_{\ell +1}=0}^{k_\ell -1} \cdots \sum _{k_{\ell +r-1}=0}^{k_{\ell +r-2}-1} {\mathfrak {b}}_{k_{\ell +r-1}}. \end{aligned}$$

We say that \({\mathfrak {a}}_{k_{\ell -1}}\) and \({\mathfrak {b}}_{k_{\ell +r-1}}\) are the head and the tail of the pre-chain, respectively.

Likewise, for a \(\nearrow \)-pre-chain of length \(r\ge 1\) we have

$$\begin{aligned}{} & {} \sum _{\scriptscriptstyle k_\ell =k_{\ell -1}+1}^\infty \sum _{\scriptscriptstyle k_{\ell +1}=k_\ell +1}^\infty \cdots \sum _{\scriptscriptstyle k_{\ell +r-1}=k_{\ell +r-2}+1}^\infty |\mathscr {D}_{k_{\ell -1},k_\ell } \mathscr {D}_{k_\ell ,k_{\ell +1}}\cdots \mathscr {D}_{k_{\ell +r-2},k_{\ell +r-1}}|\\{} & {} \quad =\frac{{\mathfrak {b}}_{k_{\ell -1}}}{2^{r-1}} \sum _{\scriptscriptstyle k_\ell =k_{\ell -1}+1}^\infty \sum _{\scriptscriptstyle k_{\ell +1}=k_\ell +1}^\infty \cdots \sum _{\scriptscriptstyle k_{\ell +r-1}=k_{\ell +r-2}+1}^\infty {\mathfrak {a}}_{k_{\ell +r-1}}. \end{aligned}$$

Now \({\mathfrak {b}}_{k_{\ell -1}}\) and \({\mathfrak {a}}_{k_{\ell +r-1}}\) are the head and the tail of the pre-chain, respectively.

Except for \(\ell =0\) and \(\ell +r=s\), we join the tail of a pre-chain to the head of the succeeding pre-chain. The outcome is \(\nearrow \)-chains and \(\searrow \)-chains. Note thus that a chain has no head, while its tail is multiplied by the head of its successor pre-chain. (In this procedure, we lose the head of the leading pre-chain and the tail of the last pre-chain, but this makes no difference to the finiteness—or otherwise—of the sum)

A \(\searrow \)-chain of length r is of the form

$$\begin{aligned} \frac{1}{2^{r-1}}\sum _{k_\ell =0}^{k_{\ell -1}-1} \sum _{k_{\ell +1}=0}^{k_\ell -1} \cdots \sum _{k_{\ell +r-1}=0}^{k_{\ell +r-2}-1} {\mathfrak {b}}_{k_{\ell +r-1}}^2=\frac{1}{2^{r}}\sum _{k_\ell =0}^{k_{\ell -1}-1} \sum _{k_{\ell +1}=0}^{k_\ell -1} \cdots \sum _{k_{\ell +r-1}=0}^{k_{\ell +r-2}-1} \frac{{\Gamma }(k_\ell +r+\alpha )}{(k_\ell +r-1)!}, \end{aligned}$$

a finite sum. Hence, it cannot be a source for unboundedness of the sum (3.5). Matters are different, though, with a \(\nearrow \)-chain of length r: straightforward algebra and Proposition 6 imply that

$$\begin{aligned}{} & {} \sum _{k_\ell =k_{\ell -1}+1}^\infty \sum _{k_{\ell +1}=k_\ell +1}^\infty \cdots \sum _{k_{\ell +r-1}=k_{\ell -r-2}+1}^\infty {\mathfrak {a}}^2_{k_{\ell +r-1}}\\{} & {} \quad =\sum _{k_{\ell +r-1}=k_{\ell -1}+r}^\infty {\mathfrak {a}}^2_{k_{\ell +r-1}} \sum _{k_\ell =k_{\ell -1}+r-1}^{k_{\ell +r-1}-r+1} \sum _{k_{\ell +1}=k_{\ell -1}+r-2}^{k_{\ell +r-1}-r+2}\cdots \sum _{k_{\ell +r-2}=k_{\ell -1}+1}^{k_{\ell +r-1}-1} \!\!1\\{} & {} \quad =\sum _{k_{\ell +r-1}=k_{\ell -1}+r}^\infty {\mathfrak {a}}_{k_{\ell +r-1}}^2 {\mathcal {Q}}_{r-1}(k_{\ell +r-1}) \sim \sum _{\ell =k_{\ell -1}+r}^\infty \frac{1}{\ell ^{\alpha -r}}. \end{aligned}$$

Therefore, boundedness takes place if \(\alpha -r>1\).

Since the length of any chain is at most \(s-1\) and \(\nearrow ^{s-1}\) is impossible (recall, \(k_0>k_s\)), the maximal length of a \(\nearrow \)-chain is \(s-2\). We thus deduce that \(\alpha >s-1\).

Theorem 7

\(\text {ind}\,w_\alpha \ge s\) for the Laguerre weight if and only if \(\alpha >s-1\).

Proof

The necessity is proved in Theorem 4, while sufficiency follows because absolute boundedness in (3.5) implies boundedness. \(\square \)

Fig. 1 Laguerre W-functions: the magnitude of \(\mathscr {D}^{\,s}\) for different values of \(\alpha \) and \(1\le s\le 3\)

In Fig. 1, we display the absolute values of the entries of \(\mathscr {D}^{\,s}\) for different values of \(\alpha \) and s. The computation involves infinite matrices, hence infinite products which need to be truncated in practice. Thus, we compute \(300\times 300\) matrices and their powers, while displaying just their \(100\times 100\) section, since this minimises the truncation effects. For \(s=1\), all differentiation matrices are bounded and of a moderate size; the sole difference is that as \(\alpha \) grows, the matrix becomes more ‘centred’ about the diagonal. However, already for \(s=2\) the difference is discernible. For \(\alpha =1\), we are right on the boundary of \(\alpha >s-1\) (on its wrong side!) and the size of \(\mathscr {D}^{\,2}\) grows rapidly: had we displayed a section of an \(M\times M\) matrix for \(M\gg 300\), the magnitude would have grown at a logarithmic rate, as indicated by the proof of absolute boundedness. Once \(\alpha >1\), the magnitude grows at a slower rate and would remain bounded for \(M\rightarrow \infty \). For \(s=3\) the cases \(\alpha =1\) and \(\alpha =2\) correspond to polynomial and logarithmic growth, respectively, and this is apparent in the figure. Finally, for \(\alpha =4\) the rate of growth slows down and it is plausible that the magnitude remains bounded as \(M\rightarrow \infty \). Note that even in a ‘good’ \(\alpha \) regime the magnitude, while decaying along rows and columns, grows along diagonals. We refer to the discussion following Fig. 2 for an explanation of this behaviour, commenting here that this phenomenon follows from \({\mathfrak {a}}_m\sim m^{-\alpha /2}\) and \({\mathfrak {b}}_n\sim n^{\alpha /2}\).

It follows from Theorem 7 that once we approximate functions in the space introduced at the start of Sect. 2, we need to choose \(\alpha >s-1\). However, there is much more to the choice of a good \(\alpha \) and we defer its discussion to Sect. 4. As it turns out, the quality of approximation is exceedingly sensitive to the right choice and numerical results indicate that there exists a ‘sweet spot’ that brings about substantially improved quality of approximation.

3.2 The Ultraspherical Family

The ultraspherical weight, a special case of the Jacobi weight, is \(w_\alpha (x)=(1-x^2)^\alpha \chi _{(-1,1)}(x)\), \(\alpha >-1\)—in our case the requirement \(w_\alpha (\pm 1)=0\) restricts \(\alpha \) to the range \((0,\infty )\). We have

$$\begin{aligned} p_n(x)&=g_n^\alpha \text {P}_n^{(\alpha ,\alpha )}(x),\\ \varphi _n(x)&=g_n^\alpha (1-x^2)^{\alpha /2} \text {P}_n^{(\alpha ,\alpha )}(x),\qquad n\in {\mathbb {Z}}_+, \end{aligned}$$

where the constant

$$\begin{aligned} g_n^\alpha =\frac{\sqrt{\frac{1}{2} n!(2n+2\alpha +1){\Gamma }(n+2\alpha +1)}}{2^\alpha {\Gamma }(n+\alpha +1)} \end{aligned}$$

orthonormalises an ultraspherical polynomial [23, p. 260]. Recalling the identity (2.3), we let

$$\begin{aligned} \mathscr {E}_{m,n}=\frac{1}{\alpha g_m^\alpha g_n^\alpha }\mathscr {D}_{m,n}=\int _{-1}^1 (1-x^2)^{\alpha -1} x\text {P}_m^{(\alpha ,\alpha )}(x)\text {P}_n^{(\alpha ,\alpha )}(x)\,\text {d}x. \end{aligned}$$

It is sufficient to derive the \(\mathscr {E}_{m,n}\)s explicitly and prove that \(\mathscr {E}\) is symmetrically separable. We recall that our interest is in odd values of \(m+n\) and assume without loss of generality that \(m\ge n+1\).

Let

$$\begin{aligned} S_{m,n}^\alpha =\int _{-1}^1 (1-x^2)^{\alpha -1}\text {P}_m^{(\alpha ,\alpha )}(x) \text {P}_n^{(\alpha ,\alpha )}(x)\,\text {d}x, \end{aligned}$$

noting that \(S_{m,n}^\alpha =0\) if \(m+n\) is odd. The three-term recurrence relation for ultraspherical polynomials is

$$\begin{aligned} x\text {P}_m^{(\alpha ,\alpha )}(x)=\frac{(m+1)(m+2\alpha +1)}{(m+1+\alpha )(2m+2\alpha +1)} \text {P}_{m+1}^{(\alpha ,\alpha )}(x) +\frac{m+\alpha }{2m+2\alpha +1} \text {P}_{m-1}^{(\alpha ,\alpha )}(x),\qquad \end{aligned}$$
(3.6)

as can be easily confirmed from Rainville [23, p. 263]. Therefore,

$$\begin{aligned}{} & {} \mathscr {E}_{m,n}\nonumber \\{} & {} \quad =\int _{-1}^1 (1-x^2)^{\alpha -1} \left[ \frac{(m+1)(m+2\alpha +1)}{(m+1+\alpha )(2m+2\alpha +1)} \text {P}_{m+1}^{(\alpha ,\alpha )} +\frac{m+\alpha }{2m+2\alpha +1} \text {P}_{m-1}^{(\alpha ,\alpha )}\right] \! \text {P}_n^{(\alpha ,\alpha )}\,\text {d}x\nonumber \\{} & {} \quad =\frac{(m+1)(m+2\alpha +1)}{(m+1+\alpha )(2m+2\alpha +1)} S_{m+1,n}^\alpha +\frac{m+\alpha }{2m+2\alpha +1} S_{m-1,n}^\alpha . \end{aligned}$$
(3.7)

Our next task is determining the explicit form of \(S_{m,n}^\alpha \) for even \(m+n\) and, without loss of generality, \(m\ge n\). This is accomplished in Appendix B and results in

$$\begin{aligned} S_{m,n}^\alpha =\frac{4^\alpha }{\alpha } \frac{{\Gamma }(m+1+\alpha ){\Gamma }(n+1+\alpha )}{n!{\Gamma }(m+1+2\alpha )},\qquad m\ge n,\; m+n\ \text{ even }. \end{aligned}$$

We conclude from (3.7) that

$$\begin{aligned} \mathscr {E}_{m,n}=\frac{4^\alpha }{\alpha } \frac{{\Gamma }(m+1+\alpha ){\Gamma }(n+1+\alpha )}{n!{\Gamma }(m+1+2\alpha )},\qquad m\ge n,\; m+n\ \text{ odd } \end{aligned}$$

and

$$\begin{aligned} \mathscr {D}_{m,n}=\alpha g_m^\alpha g_n^\alpha \mathscr {E}_{m,n}=\frac{1}{2} \sqrt{\frac{m!(2m+2\alpha +1)(2n+2\alpha +1){\Gamma }(n+1+2\alpha )}{n!{\Gamma }(m+1+2\alpha )}} \end{aligned}$$
(3.8)

is valid for all odd \(m+n\), \(m\ge n+1\)—once \(n\ge m+1\), we need to swap m and n and invert the sign. (Of course, \(\mathscr {D}_{m,n}=0\) once \(m+n\) is even.)

Our first conclusion is that the measure is symmetrically separable with

$$\begin{aligned} {\mathfrak {a}}_m=\sqrt{\frac{m!(2m+2\alpha +1)}{2{\Gamma }(m+1+2\alpha )}},\qquad {\mathfrak {b}}_n=\sqrt{\frac{(2n+2\alpha +1){\Gamma }(n+1+2\alpha )}{2n!}}. \end{aligned}$$
(3.9)
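
As a numerical illustration (a sketch added here with ad hoc names; it is not the fast algorithm of Sect. 4), one can assemble the differentiation matrix by Gauss–Jacobi quadrature, directly from the definition (2.2), and compare it entry by entry with the symmetrically separable form built from (3.9):

import numpy as np
from scipy.special import roots_jacobi, eval_jacobi, gammaln

def ultraspherical_D(N, alpha):
    x, wq = roots_jacobi(2 * N + 8, alpha - 1, alpha - 1)   # quadrature for the weight (1-x^2)^(alpha-1)
    k = np.arange(N)
    # the orthonormalisation constants g_n^alpha of Sect. 3.2, evaluated in log form
    g = np.exp(0.5 * (gammaln(k + 1) + np.log(2 * k + 2 * alpha + 1) + gammaln(k + 2 * alpha + 1)
                      - np.log(2.0)) - alpha * np.log(2.0) - gammaln(k + alpha + 1))
    P = np.array([eval_jacobi(j, alpha, alpha, x) for j in range(N)])
    dP = np.array([0.5 * (j + 2 * alpha + 1) * eval_jacobi(j - 1, alpha + 1, alpha + 1, x)
                   if j > 0 else 0.0 * x for j in range(N)])
    # phi_m' phi_n = g_m g_n (1-x^2)^(alpha-1) [-alpha x P_m + (1-x^2) P_m'] P_n
    F = -alpha * x * P + (1.0 - x ** 2) * dP
    return np.outer(g, g) * ((F * wq) @ P.T)

alpha, N = 2.0, 12
D = ultraspherical_D(N, alpha)
k = np.arange(N)
a = np.exp(0.5 * (gammaln(k + 1) + np.log(2 * k + 2 * alpha + 1) - np.log(2.0) - gammaln(k + 1 + 2 * alpha)))
b = np.exp(0.5 * (np.log(2 * k + 2 * alpha + 1) + gammaln(k + 1 + 2 * alpha) - np.log(2.0) - gammaln(k + 1)))
S = np.tril(np.outer(a, b), -1)
S = S - S.T
S[(k[:, None] + k[None, :]) % 2 == 0] = 0.0          # the checkerboard pattern of (3.2)
print(np.abs(D + D.T).max(), np.abs(D - S).max())    # both at roundoff level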

The next conclusion is that the rate of growth (or decay) is dramatically different along the rows and the columns of \(\mathscr {D}\). It follows from (3.8) and the standard Stirling formula [21, 5.11.3] that

$$\begin{aligned} \mathscr {D}_{m,n}\sim \frac{n^{\alpha +\frac{1}{2}}}{m^{\alpha -\frac{1}{2}}},\qquad m,n\gg 1,\quad m\ge n+1. \end{aligned}$$

Therefore, the elements of the differentiation matrix decay algebraically (at any rate, for \(\alpha >\frac{1}{2}\)) along rows (and, because of skew symmetry, columns) and increase along diagonals. Note that, forming powers of \(\mathscr {D}\), it is the decay along rows and columns that allows for boundedness. (Incidentally, it can be proved using special functions that \((\mathscr {D}^{\,2})_{0,0}=\alpha (2\alpha +1)/[4(\alpha -1)]\), driving home the fact, already known from Theorem 5, that \(\alpha >1\) is necessary and sufficient for boundedness. We leave the proof, which plays no further role in our narrative, as an exercise for the reader.)

Fig. 2 Ultraspherical W-functions: the magnitude of \(\mathscr {D}^{\,s}\) for different values of \(\alpha \) and \(1\le s\le 3\)

Figure 2 displays the magnitude of the powers of differentiation matrix for ultraspherical weights and different values of \(\alpha \) and s, using the same rules of engagement as in Fig. 1. Two trends are discernible, both following from our discussion. Firstly, the decay along rows accelerates as \(\alpha \) grows and \(\mathscr {D}^{\,s}\) is more concentrated near the diagonal. Secondly, the terms along the diagonal (of course, with \(m+n\) of the right parity) grow the fastest. Their rate of growth is rapid (and grows with \(\alpha \)), but this need not be a problem, at any rate once we approximate sufficiently smooth functions. In that case \(\mathscr {D}\), its powers and possibly functions (e.g. \(\exp (h\mathscr {D})\)) act on the expansion coefficients of functions in the underlying basis \(\Phi \). Provided these functions are sufficiently smooth, it is plausible that these coefficients decay very rapidly and, for analytic functions, at an exponential rate. (We defer to Sect. 4 for more substantive discussion of convergence.) Thus, large terms along the diagonal will multiply small terms in a vector of expansion coefficients—something that might conceivably cause loss of accuracy for truly huge matrices but which is probably negligible in practice.

Similarly to the Laguerre weights, we now seek to prove that Theorem 4 provides also a sufficient condition for \(\text {ind}\,w_\alpha \ge s\) for ultraspherical weights, i.e. that \(\alpha >s-1\) implies that \(\mathscr {D}^{\,s}\) is bounded. Our method of proof is similar to that of Theorem 7, except that we need to account for a number of differences: firstly, \(\mathscr {D}_{m,n}\) can be nonzero only when \(m+n\) is odd, secondly, we have symmetric separability in place of separability, and thirdly, (3.4) is no longer true and needs to be replaced by

$$\begin{aligned} {\mathfrak {a}}_m{\mathfrak {b}}_m=m+\alpha +\frac{1}{2}. \end{aligned}$$
(3.10)

Letting again \(k_0=m\), \(k_s=n\), where \(m\ge n+1\) (or \(m\ge n\) once s is even), we need to replace (3.5) by

$$\begin{aligned} {\sum _{k_1=0}^\infty }^{\!\star } {\sum _{k_2=0}^\infty }^{\!\star } \cdots {\sum _{k_{s-1}=0}^\infty }^{\!\!\!\star } |\mathscr {D}_{k_0,k_1}\mathscr {D}_{k_1,k_2}\cdots \mathscr {D}_{k_{s-2},k_{s-1}} \mathscr {D}_{k_{s-1},k_s}|<\infty , \end{aligned}$$

where the star means that we sum only over pairs \((k_{i-1},k_{i})\) such that \(k_{i-1}+k_{i}\) is odd. Note that Proposition 6 remains true for the ‘starred sum’ except that the constant (of which we care little) is different.

We again commence with \(\nearrow \)- and \(\searrow \)-pre-chains. Little changes for a \(\searrow \) chain, since the sum remains finite. The only possible challenge to boundedness may originate in a \(\nearrow \) chain. We analyse a \(\nearrow \)-pre-chain using (3.10),

$$\begin{aligned}{} & {} {\sum _{k_\ell =k_{\ell -1}+1}^\infty }^{\star } {\sum _{k_{\ell +1}=k_{\ell }+1}^\infty }^{\star } \cdots {\sum _{k_{\ell +r-1}=k_{\ell +r-2}+1}^\infty }^{\star } |\mathscr {D}_{k_\ell ,k_{\ell -1}} \mathscr {D}_{k_{\ell +1},k_\ell } \cdots \mathscr {D}_{k_{\ell +r-1},k_{\ell +r-2}}|\\{} & {} \quad ={\sum _{k_\ell =k_{\ell -1}+1}^\infty }^{\star } {\sum _{k_{\ell +1}=k_{\ell }+1}^\infty }^{\star } \cdots {\sum _{k_{\ell +r-1}=k_{\ell +r-2}+1}^\infty }^{\star } \prod _{j=\ell }^{\ell +r-2}{\mathfrak {a}}_{k_j}{\mathfrak {b}}_{k_{j-1}}\\{} & {} \quad ={\mathfrak {b}}_{k_{\ell -1}} {\sum _{k_\ell =k_{\ell -1}+1}^\infty }^{\star } \left( k_\ell +\alpha +\frac{1}{2}\right) {\sum _{k_{\ell +1}=k_{\ell }+1}^\infty }^{\star } \left( k_{\ell +1}+\alpha +\frac{1}{2}\right) \cdots {\sum _{k_{\ell +r-1}=k_{\ell +r-2}+1}^\infty }^{\star } {\mathfrak {a}}_{k_{\ell +r-1}}, \end{aligned}$$

and, using Proposition 6, transition seamlessly to a \(\nearrow \)-chain, while disregarding lower-order terms,

$$\begin{aligned}{} & {} {\sum _{k_\ell =k_{\ell -1}+1}^\infty }^{\star } \left( k_\ell +\alpha +\frac{1}{2}\right) {\sum _{k_{\ell +1}=k_{\ell }+1}^\infty }^{\star } \hspace{2pt} \left( k_{\ell +1}+\alpha +\frac{1}{2}\right) \cdots {\sum _{k_{\ell +r-1}=k_{\ell +r-2}+1}^\infty }^{\star } {\mathfrak {a}}_{k_{\ell +r-1}}^2\\{} & {} \quad \sim {\sum _{k_{\ell +r-1}=k_{\ell -1}+r}^\infty }^{\star } {\mathfrak {a}}_{k_{\ell +r-1}}^2 {\sum _{k_\ell =k_{\ell -1}+r-1}^{k_{\ell +r-1}-r+1}}^{\star } k_\ell {\sum _{k_{\ell +1}=k_{\ell -1}+r-2}^{k_{\ell +r-1}-r+2}}^{\star }\ k_{\ell +1} \cdots {\sum _{k_{\ell +r-2}=k_{\ell -1}+1}^{k_{\ell +r-1}-1}}^{\star }\hspace{6pt} k_{\ell +r-2}\\{} & {} \quad \sim {\sum _{k_{\ell +r-1}=k_{\ell -1}+r}^\infty }^{\star } {\mathfrak {a}}_{k_{\ell +r-1}}^2 {\mathcal {Q}}_{2r-2}(k_{\ell +r-1}) \sim {\sum _{k=k_{\ell -1}+r}^\infty }^{\star } \frac{1}{k^{2\alpha -2r+1}}. \end{aligned}$$

Consequently, \(\alpha >r\) is necessary and sufficient for convergence for each \(\nearrow \)-chain. Since the length of an \(\nearrow \)-chain is at most \(s-1\), we deduce, similar to Theorem 7, that

Theorem 8

\(\text {ind}\,w_\alpha \ge s\) for the ultraspherical weight if and only if \(\alpha >s-1\).

Both Theorems 7 and 8 present the same inequality. This is not surprising since, for both weights, \(\alpha \) measures the ‘strength’ of a zero at the endpoint(s).

We conclude this subsection with plots displaying the \({\ell }_2\) norms of truncated \(N\times N\) differentiation matrices for both Laguerre and ultraspherical cases for \(\alpha \in \{1,2,3,4\}\).

Fig. 3

\(\ell _2\) norms of \(N\times N\) differentiation matrices as N grows in multiples of ten. The top curve corresponds to \(\alpha =1\), underneath \(\alpha =2\), then \(\alpha =3\), and the bottom curve corresponds to \(\alpha =4\)

As can be seen from Fig. 3, the \(\ell _2\) norm of an \(N\times N\) principal minor of \(\mathscr {D}\) increases linearly with N for the Laguerre weight and quadratically for the ultraspherical weight. This has obvious implications, inter alia, for the implementation of Krylov subspace methods in the manipulation of differentiation matrices. Having said that, Krylov subspace methods may be problematic in this context [10]. In Sect. 4, we outline alternative approaches to the numerical algebra of separable differentiation matrices.
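The growth displayed in Fig. 3 is straightforward to reproduce. The following sketch is illustrative only: it assumes the separability sequences \({\mathfrak {a}}_m\), \({\mathfrak {b}}_m\) of (3.1) have already been computed for the weight at hand, assembles the \(N\times N\) principal minor of \(\mathscr {D}\) with the sign convention used in Sect. 4.1 below, and records its spectral norm.

```python
# A minimal sketch of the computation behind Fig. 3: given the separability
# sequences a_m, b_m of (3.1) for the weight at hand (assumed precomputed),
# assemble the N x N principal minor of D and record its spectral norm.
import numpy as np

def principal_minor(a, b, N):
    """D[m, n] = -a[m]*b[n] for m > n, zero diagonal, skew-symmetric above."""
    D = np.zeros((N, N))
    for m in range(N):
        for n in range(m):
            D[m, n] = -a[m]*b[n]
            D[n, m] = -D[m, n]
    return D

# e.g. norms = [np.linalg.norm(principal_minor(a, b, N), 2)
#               for N in range(10, 101, 10)]
```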

3.3 Counterexamples: generalised Hermite and Konoplev weights

Ultraspherical and Laguerre weights are the obvious and most elementary choice in the intervals \((-1,1)\) and \((0,\infty )\), respectively, and they are both separable in the sense of this paper. This might lead to an impression that separability is ubiquitous: this would be highly misleading.

Lemma 9

Let

$$\begin{aligned} \iota _{m,n}= & {} \mathscr {D}_{m,n}\mathscr {D}_{m+1,n+1}-\mathscr {D}_{m+1,n}\mathscr {D}_{m,n+1}, \end{aligned}$$
(3.11)
$$\begin{aligned} {\check{\iota }}_{m,n}= & {} \mathscr {D}_{m,n}\mathscr {D}_{m+2,n+2}-\mathscr {D}_{m+2,n}\mathscr {D}_{m,n+2}. \end{aligned}$$
(3.12)

Separability implies that \(\iota _{m,n}=0\) for all \(m\ge n+2\), while symmetric separability implies that \({\check{\iota }}_{m,n}=0\) for all \(m+n\) odd, \(m\ge n+2\).

Proof

Follows at once from the definition of (symmetric) separability. \(\square \)

Note that neither (3.11) nor (3.12) is sufficient. Thus, a skew-symmetric \(\mathscr {D}\) such that \(\mathscr {D}_{2m+1,n}=0\) for all \(m\in {\mathbb {Z}}_+\) and \(n\le 2m-1\) satisfies (3.11), but in general is not separable. Likewise, a tridiagonal skew-symmetric matrix obeys (3.12) but is not symmetrically separable—this is the case with the differentiation matrix associated with the Hermite weight, for example. Trying weights at random and computing, say, \(\iota _{2,0}\) leads time and again to non-separable weights.

To explore further the (non)existence of separable weights, we examine two weights, generalisations of Hermite and ultraspherical weights, respectively, but endowed with an additional parameter: the generalised Hermite and Konoplev weights.

3.3.1 Generalised Hermite weights

Letting \(\mu >-\frac{1}{2}\), we examine the weight

$$\begin{aligned} w_\mu (x)=|x|^{2\mu } \textrm{e}^{-x^2},\qquad x\in {\mathbb {R}} \end{aligned}$$
(3.13)

[5, p. 156], originally considered by Szegő. It can be easily deduced from Chihara [5, pp. 156–157] that the underlying W-functions are

$$\begin{aligned} \varphi _{2n}(x)= & {} (-1)^n 2^n \sqrt{\frac{n!}{{\Gamma }(n+\mu +\frac{1}{2})}} \text {L}_n^{(\mu -\frac{1}{2})}(x^2)|x|^\mu \textrm{e}^{-x^2/2},\\ \varphi _{2n+1}(x)= & {} (-1)^n 2^{n+1}\sqrt{\frac{n!}{2{\Gamma }(n+\mu +\frac{3}{2})}} x\text {L}_n^{(\mu +\frac{1}{2})}(x^2)|x|^\mu \textrm{e}^{-x^2/2},\qquad n\in {\mathbb {Z}}_+. \end{aligned}$$

Generalised Hermite weights are of marginal importance to the work of this paper, and although their differentiation matrix can be derived explicitly,

$$\begin{aligned} \mathscr {D}_{2n,2n-1}= & {} \sqrt{n},\qquad \mathscr {D}_{2m,2n-1}=0,\quad m\ge n+1,\\ \mathscr {D}_{2n+1,2n}= & {} \frac{2n+1}{2\sqrt{n+\mu +\frac{1}{2}}},\quad \mathscr {D}_{2m+1,2n}=(-1)^{m+n-1}\mu \sqrt{\frac{m!}{(n+\mu +\frac{1}{2})_{m+1-n}}}, \end{aligned}$$

where \((z)_n=z(z+1)\cdots (z+n-1)\) is the Pochhammer symbol, with skew-symmetric complement, we will not present the formal (and lengthy) algebra here. Instead, a reader might use a symbolic algebra package to compute the first few elements, enough to evaluate \(\iota _{2,0}\) and \({\check{\iota }}_{3,0}\) and check that they are both nonzero—in light of Lemma 9, this is sufficient to rule out separability and symmetric separability, respectively.
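The check suggested above can equally be carried out numerically. The sketch below is an illustration under stated assumptions rather than the omitted algebra: it builds the orthonormal polynomials from the Laguerre representation of the W-functions above (normalising by quadrature) and evaluates the below-diagonal entries through \(\mathscr {D}_{m,n}=\pm \frac{1}{2}\int _{{\mathbb {R}}} w_\mu '(x)p_m(x)p_n(x)\,\text {d}x\), the representation used in the proof of Theorem 10 (the overall sign is immaterial for \(\iota \) and \({\check{\iota }}\)). The value \(\mu =0.7\) is an arbitrary test choice.

```python
# Numerical check of Lemma 9 for the generalised Hermite weight (3.13); an
# illustrative sketch, not the omitted algebra.  Assumptions: orthonormal
# polynomials from the Laguerre representation above (normalised by
# quadrature), and D_{m,n} = +-(1/2) int w'(x) p_m(x) p_n(x) dx for m >= n+1
# (cf. the proof of Theorem 10); the sign does not affect iota.
import numpy as np
from scipy.integrate import quad
from scipy.special import eval_genlaguerre

mu = 0.7                                   # arbitrary test value, mu > -1/2

def w(x):
    return np.abs(x)**(2*mu) * np.exp(-x**2)

def q(n, x):
    """Unnormalised degree-n polynomial orthogonal with respect to w."""
    k, odd = divmod(n, 2)
    return x*eval_genlaguerre(k, mu + 0.5, x**2) if odd else eval_genlaguerre(k, mu - 0.5, x**2)

norms = [np.sqrt(2*quad(lambda x, n=n: w(x)*q(n, x)**2, 0, np.inf)[0]) for n in range(6)]

def D(m, n):
    """Below-diagonal entry D_{m,n}; even-even and odd-odd entries vanish by parity."""
    if (m + n) % 2 == 0:
        return 0.0
    g = lambda x: (2*mu/x - 2*x) * w(x) * q(m, x) * q(n, x)   # w'(x) q_m q_n for x > 0
    return quad(g, 0, np.inf)[0] / (norms[m]*norms[n])        # (1/2) int_R = int_0^inf

iota_20  = D(2, 0)*D(3, 1) - D(3, 0)*D(2, 1)
check_30 = D(3, 0)*D(5, 2) - D(5, 0)*D(3, 2)
print(iota_20, check_30)                   # both visibly nonzero whenever mu != 0
```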

As a matter of fact, \(\mathscr {D}\) has an interesting shape: its \((2n+1)\)st columns (hence also the \((2n+1)\)st rows) are consistent with a tridiagonal matrix, more specifically with the differentiation matrix corresponding to the standard Hermite weight (i.e. with \(\mu =0\)). In particular,

$$\begin{aligned} \iota _{2n,2n-1}=\left( n+\frac{1}{2}\right) \sqrt{\frac{n}{n+\mu +\frac{1}{2}}}\ne 0,\qquad \iota _{2n+1,2n}=\left( n+\frac{1}{2}\right) \sqrt{\frac{n+1}{n+\mu +\frac{1}{2}}}\ne 0, \end{aligned}$$

otherwise \(\iota _{m,n}=0\) for \(m\ge n+2\), while

$$\begin{aligned} {\check{\iota }}_{2n+3,2n}= & {} \frac{\mu \sqrt{(n+2)!} [\mu \sqrt{(n+1)!}+n+\frac{3}{2}]}{(n+\mu +\frac{3}{2})\sqrt{(n+\mu +\frac{1}{2})(n+\mu +\frac{5}{2})}} \ne 0,\\ {\check{\iota }}_{2m+4,2n+1}= & {} 0,\qquad m\ge n+2. \end{aligned}$$

In each case, the separability tests (3.11) and (3.12) fail only marginally—but fail nonetheless.

3.3.2 Konoplev weights

Letting \(\alpha ,\gamma >-1\), we set

$$\begin{aligned} w_{\alpha ,\gamma }(x)=|x|^{2\gamma +1}(1-x^2)^\alpha ,\qquad x\in (-1,1). \end{aligned}$$
(3.14)

The weight (3.14), which has been considered in [16, 17] and described in Chihara [5, p. 155], generalises ultraspherical weights by adding the possibly weakly singular factor \(|x|^{2\gamma +1}\). Specifically, \(w_{\alpha ,\gamma }\in \text {C}^s(-1,1)\) if and only if

$$\begin{aligned} \text{ either }\qquad \gamma \in \left\{ \frac{k}{2}\,:\, k\in \{-1,0,\ldots ,s-2\}\right\} \qquad \text{ or }\qquad \gamma >\frac{s-1}{2}. \end{aligned}$$

The underlying orthogonal polynomial system is

$$\begin{aligned} \text {S}_{2n}(x)=\text {P}_n^{(\alpha ,\gamma )}(2x^2-1),\qquad \text {S}_{2n+1}(x)=x\text {P}_n^{(\alpha ,\gamma +1)}(2x^2-1),\qquad n\in {\mathbb {Z}}_+, \end{aligned}$$

and the monic polynomials obey the three-term recurrence relation

$$\begin{aligned} \hat{\text {S}}_{n+1}(x)=x\hat{\text {S}}_n(x)-c_n \hat{\text {S}}_{n-1}(x), \end{aligned}$$

where

$$\begin{aligned} c_{2n}=\frac{n(n+\alpha )}{(2n+\alpha +\gamma )(2n+1+\alpha +\gamma )},\qquad c_{2n+1}=\frac{(n+1+\gamma )(n+1+\alpha +\gamma )}{(2n+\alpha +\gamma )(2n+1+\alpha +\gamma )}. \end{aligned}$$

Replacing the Jacobi polynomials by their orthonormal counterparts and using a formula from Rainville [23, p. 260], we confirm by easy algebra that

$$\begin{aligned} \kappa _{2m}^{\alpha ,\gamma }= & {} \int _{-1}^1 w_{\alpha ,\gamma }(x) S_{2m}^2(x)\,\text {d}x=\frac{(m+1+\alpha +\gamma )_m {\Gamma }(m+1+\alpha ){\Gamma }(m+1+\gamma )}{m!{\Gamma }(2m+2+\alpha +\gamma )},\\ \kappa _{2m+1}^{\alpha ,\gamma }= & {} \int _{-1}^1 w_{\alpha ,\gamma }(x) S_{2m+1}^2(x)\,\text {d}x=\frac{(m+2+\alpha +\gamma )_m{\Gamma }(m+1+\alpha ){\Gamma }(m+2+\gamma )}{m!{\Gamma }(2m+3+\alpha +\gamma )}, \end{aligned}$$

therefore

$$\begin{aligned} \varphi _{2m}(x)= & {} \frac{|x|^{\gamma +\frac{1}{2}}(1-x^2)^{\alpha /2}}{\sqrt{\kappa _{2m}^{\alpha ,\gamma }}} \text {P}_m^{(\alpha ,\gamma )}(2x^2-1),\\ \varphi _{2m+1}(x)= & {} \frac{x|x|^{\gamma +\frac{1}{2}}(1-x^2)^{\alpha /2}}{\sqrt{\kappa _{2m+1}^{\alpha ,\gamma }}}\text {P}_m^{(\alpha ,\gamma +1)}(2x^2-1). \end{aligned}$$

The weights (3.14) are symmetric; thus, we examine the possibility of symmetric separability. A brute force computation yields

$$\begin{aligned} {\check{\iota }}_{3,0}=(5+\alpha +\gamma )(2\gamma +1) \sqrt{\frac{(4+\alpha +\gamma )(6+\alpha +\gamma )}{2(1+\alpha )(2+\alpha )(1+\gamma )(3+\gamma )}}, \end{aligned}$$

ruling out symmetric separability except for the case \(\gamma =-\frac{1}{2}\), which corresponds to the ultraspherical weight.
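For readers who prefer a numerical sanity check to the brute-force algebra, the following sketch (again an illustration under the same assumptions as for the generalised Hermite weight, with arbitrary test values \(\alpha =1.5\), \(\gamma =0.25\)) evaluates \({\check{\iota }}_{3,0}\) by quadrature and compares it with the closed form above.

```python
# Numerical cross-check (a sketch under the same assumptions as before) of the
# closed form above for iota-check_{3,0} of the Konoplev weight (3.14); the
# representation for D is valid here since w vanishes at +-1 for alpha > 0.
# alpha = 1.5 and gamma = 0.25 are arbitrary test values.
import numpy as np
from scipy.integrate import quad
from scipy.special import eval_jacobi

al, ga = 1.5, 0.25

def w(x):
    return np.abs(x)**(2*ga + 1) * (1 - x**2)**al

def S(n, x):
    k, odd = divmod(n, 2)
    return x*eval_jacobi(k, al, ga + 1, 2*x**2 - 1) if odd else eval_jacobi(k, al, ga, 2*x**2 - 1)

kappa = [quad(lambda x, n=n: w(x)*S(n, x)**2, -1, 1, points=[0.0])[0] for n in range(6)]

def D(m, n):
    if (m + n) % 2 == 0:
        return 0.0
    wp = lambda x: ((2*ga + 1)/x - 2*al*x/(1 - x**2)) * w(x)            # w'(x)
    g = lambda x: wp(x) * S(m, x) * S(n, x) / np.sqrt(kappa[m]*kappa[n])
    return 0.5*quad(g, -1, 1, points=[0.0])[0]

check_30 = D(3, 0)*D(5, 2) - D(5, 0)*D(3, 2)
closed = (5 + al + ga)*(2*ga + 1)*np.sqrt((4 + al + ga)*(6 + al + ga)
          / (2*(1 + al)*(2 + al)*(1 + ga)*(3 + ga)))
print(check_30, closed)       # agreement up to quadrature error is expected
```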

3.3.3 A Limiting Behaviour of the \(\iota _{m,n}\)s

While separability, hence \(\iota _{m,n}=0\) for \(m\ge n+2\), appears to be exceedingly rare, we claim that the latter holds more broadly in a much weaker, asymptotic form.

Let w be a weight in \((a,b)\), \(w(a)=w(b)=0\), with the underlying orthonormal polynomials \(\{p_n\}_{n=0}^\infty \), where the coefficient of \(x^n\) in \(p_n\) is \(k_n>0\). Comparing the coefficients of \(x^{n+1}\) in the three-term recurrence relation (1.3), we deduce at once that \(k_{n+1}/k_n=\beta _n^{-1}\).

Theorem 10

Assuming that \(\beta _n\ge \beta ^*>0\) and \({\tilde{w}}(x)=[w'(x)]^2/w(x)\) is itself a weight function in \((a,b)\), it is true that

$$\begin{aligned} \lim _{m\rightarrow \infty }\iota _{m,n}=0,\qquad n\in {\mathbb {Z}}_+. \end{aligned}$$
(3.15)

Proof

Letting \(m\ge n+2\), (2.3) yields

$$\begin{aligned} \iota _{m,n}= & {} \frac{1}{4} \int _a^b w'(y) p_m(y)p_n(y)\,\text {d}y \int _a^b w'(x) p_{m+1}(x) p_{n+1}(x)\,\text {d}x\\{} & {} -\frac{1}{4} \int _a^b w'(x) p_{m+1}(x)p_n(x)\,\text {d}x\int _a^b w'(y) p_m(y) p_{n+1}(y)\,\text {d}y\\= & {} \frac{1}{4} \int _a^b \int _a^b w'(x)w'(y) p_{m+1}(x)p_m(y)[ p_{n+1}(x)p_n(y)-p_n(x)p_{n+1}(y)]\,\text {d}x\,\text {d}y. \end{aligned}$$

We recall the Christoffel–Darboux formula,

$$\begin{aligned} \sum _{\ell =0}^n p_\ell (x)p_\ell (y)=\frac{k_n}{k_{n+1}} \frac{p_{n+1}(x)p_n(y)-p_n(x)p_{n+1}(y)}{x-y}, \end{aligned}$$

where \(k_n>0\) is the coefficient of \(x^n\) in \(p_n\) [5, p. 153]. Therefore,

$$\begin{aligned} \iota _{m,n}= & {} \frac{k_{n+1}}{k_n}\int _a^b \int _a^b w'(x)w'(y) p_{m+1}(x) p_m(y) (x-y) \sum _{\ell =0}^n p_\ell (x)p_\ell (y) \,\text {d}x\,\text {d}y\nonumber \\\le & {} \frac{1}{\beta ^*} \left| \int _a^b \int _a^b w'(x)w'(y) p_{m+1}(x) p_m(y) (x-y) \sum _{\ell =0}^n p_\ell (x)p_\ell (y) \,\text {d}x\,\text {d}y\,\right| \!,\qquad \end{aligned}$$
(3.16)

because \(\beta _n^{-1}\le {\beta ^*}^{-1}\). Letting \(n\rightarrow \infty \) in (3.16), we obtain

$$\begin{aligned} \lim _{n\rightarrow \infty }\iota _{m,n}\le & {} \frac{1}{\beta ^*}\left| \int _a^b \int _a^b w'(x)w'(y)p_{m+1}(x)p_m(y) (x-y) \sum _{\ell =0}^\infty p_\ell (x)p_\ell (y) \,\text {d}x\,\text {d}y\right| \\= & {} \frac{1}{\beta ^*}\left| \int _a^b \int _a^b \frac{w'(x)w'(y)}{\sqrt{w(x)w(y)}} p_{m+1}(x)p_m(y) (x-y) K(x,y)\,\text {d}x\,\text {d}y\right| , \end{aligned}$$

where K is the Christoffel–Darboux kernel from the proof of Theorem 5. According to (2.8), it is a reproducing kernel and it follows at once that the double integral vanishes. \(\square \)

The condition \(\beta _n\ge \beta ^*>0\), \(n\in {\mathbb {Z}}_+\), is very weak: we already know that \(\beta _n>0\); all that the condition adds is that the \(\beta _n\)s are bounded away from zero.

4 Computational Aspects

4.1 A Product of \(\mathscr {D}\) by a Vector

Practical implementation of the ideas of this paper requires manipulation of expressions involving a matrix \(\mathscr {D}\) which is either separable or symmetrically separable: the formation of products of the form \(\mathscr {D}^{\,r}\varvec{f}\) for \(r\in {\mathbb {N}}\), the solution of algebraic linear systems of the form \(p(\mathscr {D})\varvec{y}=\varvec{x}\), where p is a polynomial, and the computation of \(\textrm{e}^{h\mathscr {D}}\varvec{u}\). Tridiagonal differentiation matrices, of the form considered in [11], enjoy a substantial advantage in this context. Yet, once a weight function is separable (or symmetrically separable), all these objectives can be attained using fast algorithms. The matrix \(\mathscr {D}\) is a special case of a semiseparable matrix [7, p. 50]: all its minors located either wholly above the diagonal or wholly beneath it are of rank 1. This allows for fast products and fast solution of linear systems [6, 7, 28] and, using the Cauchy–Dunford integral formula (also known as the Dunford–Taylor or Riesz–Fantappié formula [4]), fast computation of \(\textrm{e}^{h\mathscr {D}}\varvec{u}\).

In this subsection, we examine in detail the formation of products of the form \(\varvec{h}=\mathscr {D}\,\varvec{f}\), where \(\varvec{f}\) is a (real or complex) infinite-dimensional vector, while \(\mathscr {D}\) is either separable or symmetrically separable. While the main idea is not new—cf. for example, [6]—there is merit in presenting it for the convenience of the reader, as an elementary example of more advanced numerical algebra computations in [6, 7, 28]. As a matter of fact, our algorithm is somewhat more general, because it is based on infinite-dimensional computations.

Consider a separable weight function, e.g. a Laguerre weight. The starting point is an integer N, typically much larger than M, such that \(|f_m|\) is negligible (in practical terms, smaller than a user-provided error tolerance) for \(m>N\), and we wish to form

$$\begin{aligned} h_m=\sum _{n=0}^N \mathscr {D}_{m,n} f_n,\qquad m=0,\ldots ,M. \end{aligned}$$
(4.1)

We commence by assuming that a weight is separable, whereby (3.1) yields

$$\begin{aligned} h_m=- {\mathfrak {a}}_m \sum _{n=0}^{m-1} {\mathfrak {b}}_n f_n + {\mathfrak {b}}_m\!\! \sum _{n=m+1}^N {\mathfrak {a}}_n f_n=-{\mathfrak {a}}_m\sigma _m+{\mathfrak {b}}_m\rho _m,\qquad m=0,\ldots ,M, \end{aligned}$$

where

$$\begin{aligned} \sigma _m=\sum _{n=0}^{m-1} {\mathfrak {b}}_n f_n,\quad \rho _m=\sum _{n=m+1}^N {\mathfrak {a}}_n f_n,\qquad m=0,\ldots ,M. \end{aligned}$$

Then

$$\begin{aligned} h_0= & {} {\mathfrak {b}}_0 \rho _0,\\ h_m= & {} -{\mathfrak {a}}_m\sigma _m+{\mathfrak {b}}_m\rho _m,\qquad m=1,\ldots ,M,\\{} & {} \text{ where }\quad \sigma _m=\sigma _{m-1}+{\mathfrak {b}}_{m-1}f_{m-1},\quad \rho _m=\rho _{m-1}-{\mathfrak {a}}_m f_m. \end{aligned}$$

Assuming that the \({\mathfrak {a}}_m\)s and \({\mathfrak {b}}_n\)s have been precomputed (and this needs to be done only once, no matter how many products are required), the calculation (4.1) takes just \(\approx N+4M\) flops—and by the same token, computing the first \(M+1\) entries of \(\mathscr {D}_N^{\,r}\!\varvec{f}_{\!N}\) takes \(\approx r(N+4M)\) flops.
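A minimal sketch of this recursion follows (illustrative names; the sequences \({\mathfrak {a}}\), \({\mathfrak {b}}\) and the coefficient vector \(\varvec{f}\) are assumed given as arrays of length \(N+1\)).

```python
import numpy as np

def apply_D(a, b, f, M):
    """First M+1 entries of D f for a separable differentiation matrix,
    D[m, n] = -a[m]*b[n] for m > n with skew-symmetric complement, via the
    running sums sigma_m, rho_m above: O(N + M) operations in total."""
    N = len(f) - 1
    h = np.empty(M + 1, dtype=np.result_type(a, b, f))
    sigma = 0.0                         # sigma_0 = 0 (empty sum)
    rho = np.dot(a[1:N+1], f[1:N+1])    # rho_0 = sum_{n=1}^N a_n f_n
    h[0] = b[0] * rho
    for m in range(1, M + 1):
        sigma += b[m-1] * f[m-1]        # sigma_m = sigma_{m-1} + b_{m-1} f_{m-1}
        rho   -= a[m] * f[m]            # rho_m   = rho_{m-1}   - a_m f_m
        h[m] = -a[m]*sigma + b[m]*rho
    return h
```

Repeated application yields \(\mathscr {D}^{\,r}\varvec{f}\), in line with the flop count quoted above.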

A similar operation count applies to a symmetrically separable weight, whereby the entries of \(\mathscr {D}\) obey (3.2). Assuming that both M and N are even, we have

$$\begin{aligned} h_{2m}= & {} \sum _{n=0}^{N/2} \mathscr {D}_{2m,2n+1} f_{2n+1},\\ h_{2m+1}= & {} \sum _{n=0}^{N/2} \mathscr {D}_{2m+1,2n} f_{2n},\qquad m=0,\ldots ,\frac{M}{2}. \end{aligned}$$

Therefore

$$\begin{aligned} h_{2m}= & {} {\mathfrak {a}}_{2m}\sum _{n=0}^{m-1} {\mathfrak {b}}_{2n+1} f_{2n+1}-{\mathfrak {b}}_{2m}\sum _{n=m+1}^{N/2} {\mathfrak {a}}_{2n+1} f_{2n+1},\\ h_{2m+1}= & {} {\mathfrak {a}}_{2m+1} \sum _{n=0}^{m-1} {\mathfrak {b}}_{2n} f_{2n}-{\mathfrak {b}}_{2m+1}\sum _{n=m+1}^{N/2} {\mathfrak {a}}_{2n} f_{2n},\qquad m=0,\ldots ,\frac{M}{2}. \end{aligned}$$

Set

$$\begin{aligned} \sigma _m^{\text {E}}&\displaystyle =\sum _{n=0}^{m-1} {\mathfrak {b}}_{2n} f_{2n},\qquad&\sigma _m^{\text {O}}=\sum _{n=0}^{m-1} {\mathfrak {b}}_{2n+1}f_{2n+1},\\ \rho _m^{\text {E}}&\displaystyle =\sum _{n=m+1}^{N/2} {\mathfrak {a}}_{2n} f_{2n},\qquad&\rho _m^{\text {O}}=\sum _{n=m+1}^{N/2} {\mathfrak {a}}_{2n+1} f_{2n+1}, \end{aligned}$$

hence

$$\begin{aligned} h_{2m}={\mathfrak {a}}_{2m}\sigma _m^{\text {O}}-{\mathfrak {b}}_{2m} \rho _m^{\text {O}},\qquad h_{2m+1}={\mathfrak {a}}_{2m+1} \sigma _m^{\text {E}}-{\mathfrak {b}}_{2m+1} \rho _m ^{\text {E}},\qquad m=0,\ldots ,\frac{M}{2}. \end{aligned}$$

However,

$$\begin{aligned} \sigma _0^{\text {E}}=\sigma _0^{\text {O}}=0,\qquad \rho _0^{\text {E}} =\sum _{n=1}^{N/2} {\mathfrak {a}}_{2n}f_{2n},\qquad \rho _0^{\text {O}}=\sum _{n=1}^{N/2} {\mathfrak {a}}_{2n+1} f_{2n+1}, \end{aligned}$$

and

$$\begin{aligned} \sigma _m^{\text {E}}= & {} \sigma _{m-1}^{\text {E}}+{\mathfrak {b}}_{2m-2}f_{2m-2},\qquad \sigma _m^{\text {O}}=\sigma _{m-1}^{\text {O}}+{\mathfrak {b}}_{2m-1}f_{2m-1},\\ \rho _m^{\text {E}}= & {} \rho _{m-1}^{\text {E}}-{\mathfrak {a}}_{2m}f_{2m},\qquad \rho _m^{\text {O}}=\rho _{m-1}^{\text {O}}-{\mathfrak {a}}_{2m+1}f_{2m+1},\qquad m=1,\ldots ,\frac{M}{2}. \end{aligned}$$

Thus, again, we need just \(\approx 4M+N\) flops to compute the first \(M+1\) terms of \(\mathscr {D}_N\varvec{f}_{\!N}\).

4.2 The Speed of Convergence

The convergence of orthogonal polynomials to ‘nice’ (in particular, analytic) functions is well understood, and this can be leveraged to the case of W-functions. It is beneficial first, though, to present some computational results, to highlight the importance of choosing the right value of \(\alpha \) in the context of either Laguerre or ultraspherical weights, while comparing them to standard approximation by the underlying orthogonal polynomials.

It rapidly becomes apparent that we have a competition between different imperatives:

  • The number of zero boundary conditions: This determines the value of \(\alpha \) and, according to Theorems 7 and 8, we need \(\alpha >s-1\) in either case.

  • Regularity of approximating functions: While \(\mathscr {P}\) consists of polynomials, hence analytic functions, this is not the case with \(\Phi \), whether in the context of ultraspherical or Laguerre weights: it all depends on the value of \(\alpha \). If \(\alpha \) is an even integer, then the \(\varphi _n\)s are analytic; otherwise, analyticity fails at the endpoints.

  • The underlying function space: Much depends on how the error is measured. Among the many possibilities, we single out two: the \(\text {L}_p(a,b)\) norm for a suitable value of p (in particular, the \(\text {L}_2(a,b)\) norm) and the \(\text {L}_\infty [a,b]\) (and, more generally, \(\text {H}_\infty ^p[a,b]\)) norm. The choice of a norm depends on the underlying application.

Preliminary numerical experimentation reveals a remarkable state of affairs. In Fig. 4, we let \(\alpha \) be in \(\{1,2,3,4\}\). In this and all figures in this paper, we denote \(\alpha =1\) by a red, dotted line, \(\alpha =2\) by a magenta solid line, \(\alpha =3\) by a green dashed line and, finally, \(\alpha =4\) by a blue dash-dotted line. Because of the rapid decay of errors, we display them all in a logarithmic scale to base 10—in other words, the y-axis displays the number of decimal digits. Given a function f and recalling the expansion coefficients \({\hat{f}}_n^P\) and \({\hat{f}}^\Phi _n\) from Remark 1, corresponding to expansions in \(\mathscr {P}\) and \(\Phi \), respectively, we let

$$\begin{aligned} F^P_N(x)=\sum _{n=0}^N {\hat{f}}_n^P p_n(x),\quad F^\Phi _N(x)=\sum _{n=0}^N {\hat{f}}_n^\Phi \varphi _n(x),\qquad N\in {\mathbb {Z}}_+. \end{aligned}$$

Thus, \(F^P_N-f\) and \(F^\Phi _N-f\) are the (pointwise) errors with respect to the polynomial and the W-function basis, respectively, and we need to measure them in an appropriate norm. We denote by \({}^d\!F_N^P\) the derivative expansion, i.e. with \(p_n\) and f replaced by \(p_n'\) and \(f'\), respectively, similarly for higher derivatives and for \(F^\Phi _N\).
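As an aside, errors of this kind are easy to compute in practice. The sketch below (for the ultraspherical weight; the test function is a placeholder satisfying \(f(\pm 1)=0\) and is not the function of (4.2)) expands f in the W-function basis by quadrature and takes the maximum error on a fine grid as a proxy for the \(\text {L}_\infty \) norm; in double precision, quadrature accuracy (roughly \(10^{-10}\)) limits what such a sketch can resolve.

```python
# A sketch (not the code behind Fig. 4) of the error computation for
# ultraspherical W-functions: phi_n = (1-x^2)^(alpha/2) p_n with p_n the
# orthonormal Jacobi-(alpha, alpha) polynomial, normalised by quadrature; the
# maximum over a fine grid serves as a proxy for the L_inf norm.
import numpy as np
from scipy.integrate import quad
from scipy.special import eval_jacobi

def wfun_error(f, alpha, N, grid=np.linspace(-1.0, 1.0, 2001)):
    F = np.zeros_like(grid)
    for n in range(N + 1):
        nrm = quad(lambda t: (1 - t**2)**alpha * eval_jacobi(n, alpha, alpha, t)**2, -1, 1)[0]
        phi = lambda t: (1 - t**2)**(alpha/2) * eval_jacobi(n, alpha, alpha, t) / np.sqrt(nrm)
        c = quad(lambda t: f(t) * phi(t), -1, 1)[0]       # hat f_n^Phi = <f, phi_n>
        F += c * phi(grid)                                # accumulate F_N^Phi
    return np.max(np.abs(F - f(grid)))

f = lambda x: np.cos(np.pi*x/2)        # placeholder with f(+-1) = 0; not (4.2)
print([wfun_error(f, a, 20) for a in (1, 2, 3, 4)])
```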

4.2.1 Ultraspherical W-Functions

We commence from ultraspherical weights and consider

(4.2)
Fig. 4

Ultraspherical W-functions: The errors \(\log _{10}\Vert F_N^P-f\Vert _\infty \) (top left), \(\log _{10}\Vert {}^d\!F_N^P-f'\Vert _\infty \) (top right), \(\log _{10}\Vert F_N^\Phi -f\Vert _\infty \) (bottom left) and \(\log _{10}\Vert {}^d\!F_N^\Phi -f'\Vert _\infty \) (bottom right) for \(\alpha =1,2,3,4\) and \(N=1,2,\ldots ,30\) (except that in the bottom-right plot only \(\alpha =2\) is displayed)

In Fig. 4, we display, on a logarithmic scale, the \(\text {L}_\infty [-1,1]\) error for polynomial approximation to f and its first derivative (top row) and for W-functions for the ultraspherical weight (bottom row). Polynomial approximation—as can be expected from general theory and the analyticity of f—decays at an exponential speed and, for \(N=30\), we attain \(\approx 32\) significant digits. This is also the case with derivatives, with a very minor degradation in accuracy. The error for W-functions, though, is radically different. The errors for \(\alpha \in \{1,3,4\}\) decay very slowly, at a polynomial rate, and for \(N=30\) we recover just \(\approx 4\) significant digits, an unacceptably large error. On the other hand, the error for \(\alpha =2\) at \(N=30\) is \(\approx 3\times 10^{-39}\), significantly better than polynomial approximation!

The reason for this ‘miraculous’ behaviour for \(\alpha =2\) bears some attention. It is perhaps little surprise that \(\alpha =1\) behaves poorly, because it is at the wrong end of the boundedness condition for \(\mathscr {D}^{\,2}\). However, as a matter of fact, we do not consider second derivatives in this particular instance, and \(\alpha \in \{3,4\}\) are just as bad. The reasons are as follows. For \(\alpha \in \{1,3\}\), the \(\varphi _n\)s have a weak singularity at the endpoints, while \(\varphi _n'\) becomes singular there. For \(\alpha =4\), on the other hand, \(\varphi _n'(\pm 1)=0\) means that \(\text {L}_\infty \) convergence of derivatives is impossible unless the derivatives of f also vanish at the endpoints. (This is the reason why \(\log _{10}\Vert {}^d\!F_N^\Phi -f'\Vert _\infty \) is displayed only for \(\alpha =2\).)

Fig. 5

Ultraspherical W-functions: The errors \(\log _{10}\Vert F_N^P-f\Vert _2\) (left) and \(\log _{10}\Vert F_N^\Phi -f\Vert _2\) (right) for \(\alpha =1,2,3,4\)

Not much changes if, instead of \(\text {L}_\infty \), we compute an \(\text {L}_2\) error, except that in general \(\text {L}_2\)-like norms are more forgiving. In principle, neither singularities nor excessive vanishing of derivatives at the endpoints need prevent convergence. Thus, in Fig. 5 we plot the \(\text {L}_2(-1,1)\) errors for example (4.2). The overall picture remains the same: polynomial approximation decays at an exponential rate and we attain, regardless of the choice of \(\alpha \), about 34 significant digits for \(N=30\), while W-function approximation for \(\alpha \in \{1,3,4\}\) is very poor; yet, for \(\alpha =2\), we again hit the ‘sweet spot’ and recover \(\approx 38\) significant digits. W-functions are vastly superior for \(\alpha =2\) but fail dismally otherwise.

To explore further the error committed by ultraspherical W-functions, we consider

$$\begin{aligned} f(x)=(1-2x)\cos ^2\frac{\pi x}{2} \end{aligned}$$
(4.3)

The only difference in this (not very imaginative!) choice is that now \(f(\pm 1)=f'(\pm 1)=0\). We display the \(\text {L}_\infty \) error for \(f^{(i)}\), \(i=0,1,2\), in Fig. 6 for the W-functions. The error in polynomial approximation is roughly independent of \(\alpha \), and for \(N=30\) we attain \(\approx 24\) decimal digits for f, \(\approx 21\) for \(f'\) and \(\approx 19\) for \(f''\). By this stage, we should not be surprised that \(\alpha =1\) and \(\alpha =3\) do badly in approximating f, because of the weak singularity at the endpoints, and that they fail altogether in approximating derivatives. For \(\alpha \in \{2,4\}\), the \(\varphi _n\)s are analytic at the endpoints and they do very well indeed, definitely better than polynomial approximation. \(\alpha =4\) is the winner, unsurprisingly, because the boundary behaviour of f is matched by \(\Phi \). However, \(\alpha =2\) does quite well, worse by perhaps two decimal digits but still beating polynomial approximation. The reason is that too few zero Dirichlet boundary conditions do not prevent \(\text {L}_\infty \) convergence of an orthogonal sequence, although they might slow it down to a modest extent. On the other hand, excessive zero Dirichlet boundary conditions prevent \(\text {L}_\infty \) convergence at the endpoints. Thus, the interplay between the number of zero boundary conditions and the choice of \(\alpha \) is not symmetric! It is always better to err by choosing a smaller \(\alpha \), as long as it is an even integer, consistent with the bound of Theorem 8.

Fig. 6

Ultraspherical W-functions: The errors \(\log _{10}\Vert F_N^\Phi -f\Vert _2\) (left) \(\log _{10}\Vert {}^d\!F_N^\Phi -f'\Vert _2\) (centre) and \(\log _{10}\Vert {}^{dd}\!F_N^\Phi -f''\Vert _2\) (right) for \(\alpha =1,2,3,4\) and the function (4.3)

4.2.2 Laguerre W-Functions

We are now concerned with the Laguerre weight and choose the model problem

$$\begin{aligned} f(x)=\textrm{e}^{-x}\sin x,\qquad x\ge 0. \end{aligned}$$
(4.4)

Note that \(f(0)=0\), \(f'(0)\ne 0\).

Fig. 7

Laguerre W-functions: The errors \(\log _{10}|F_{40}^P(x)-f(x)|\) (left) and \(\log _{10}|F_{40}^\Phi (x)-f(x)|\) (right) for \(x\in [0,30]\) and \(\alpha =1,2,3,4\)

An expansion in Laguerre (or any other) polynomials cannot be bounded in an infinite interval; hence, instead of plotting \(\log _{10}\Vert F_N^P-f\Vert _2\) for increasing values of N, we choose \(N=40\) and plot the pointwise error in the interval [0, 30]. This is evident on the left of Fig. 7: the error is just about fine for small \(x>0\), subsequently growing rapidly (as a matter of fact, exponentially). On the other hand, as can be seen on the right of that figure, the error of W-functions is uniformly bounded. For \(\alpha \in \{1,3,4\}\), it is fairly similar—and unacceptably large—while for \(\alpha =2\) we attain \(\approx 10\) decimal digits of accuracy, apparently uniformly in \([0,\infty )\). Yet again we have the ‘sweet spot’ for \(\alpha =2\). This state of affairs remains true for the first few derivatives, and the deterioration in accuracy using W-functions is very mild indeed.

Finally, we consider

$$\begin{aligned} f(x)=\textrm{e}^{-x}\sin ^2x,\qquad x\ge 0. \end{aligned}$$
(4.5)

Now \(f(0)=f'(0)=0\) and \(f''(0)\ne 0\). There is no need to display the \(\text {L}_\infty [0,\infty )\) error committed by Laguerre polynomials since, again, it is unbounded.

Fig. 8

Laguerre W-functions: The errors \(\log _{10}|F_{60}^\Phi (x)-f(x)|\) (left), \(\log _{10}|{}^d\!F_{60}^\Phi (x)-f'(x)|\) (centre) and \(\log _{10}|{}^{dd}\!F_{60}^\Phi (x)-f''(x)|\) (right) for the function (4.5), \(x\in [0,30]\) and \(\alpha =1,2,3,4\)

In Fig. 8, we employ the same colour and style scheme to plot the errors committed in [0, 30] for f, \(f'\) and \(f''\). Clearly, \(\alpha =2\) and \(\alpha =4\), the two values associated with analyticity at the origin, win insofar as approximating the function itself is concerned, although the margin is somewhat smaller than in our other examples. The approximation of the first and the second derivatives is more interesting: on the face of it, it is a dead heat between \(\alpha =2\) and \(\alpha =4\), but closer examination of the behaviour near the left endpoint reveals a crucial difference. For example, for \(\eta =10^{-10}\) we have (to four significant digits)

$$\begin{aligned} \begin{array}{l|cccc} \alpha & 1 & 2 & 3 & 4\\ \hline |F_{60}^\Phi (\eta )-f(\eta )| & 2.128_{-08} & 9.631_{-15} & 1.434_{-16} & 7.778_{-24}\\ |{}^d\!F_{60}^\Phi (\eta )-f'(\eta )| & 1.064_{+02} & 9.631_{-05} & 2.151_{-06} & 9.555_{-14}\\ |{}^{dd}\!F^\Phi _{60}(\eta )-f''(\eta )| & 5.319_{+11} & 4.092_{-03} & 1.075_{+05} & 9.555_{-04} \end{array} \end{aligned}$$

The conclusion is clear. Once the inequality of Theorem 7 is breached, the approximation blows up at the origin: this happens with \(\alpha =1\) and any derivative. The error for \(\alpha =3\) decays for \(N\gg 1\) for the function value and the first derivative, but it blows up for the second derivative, while for \(\alpha =2\) the progression to the correct boundary condition is considerably slower than for \(\alpha =4\). This is apparent from Fig. 9: \(\alpha =4\) wins, although by a small margin.

Fig. 9

Laguerre W-functions: A close-up of the bottom plot in Fig. 8 near the left endpoint

4.2.3 The Speed of Convergence Redux

As promised, we leverage standard theory on the convergence of orthogonal expansions of analytic functions to the setting of W-functions. For simplicity, we consider just the ultraspherical weight, supported in \([-1,1]\), but our argument readily extends to all other settings. Thus, suppose that the function f is analytic in a Bernstein ellipse enveloping \([-1,1]\), with \(f(\pm 1)=0\), and let \({\tilde{f}}(x)=f(x)/(1-x^2)\). Note that \({\tilde{f}}\) is also analytic in \([-1,1]\). Expanding f in the ultraspherical W-function basis yields the coefficients

$$\begin{aligned} {\hat{f}}_n=\int _{-1}^1 (1-x^2)^{\alpha /2} f(x) \text {P}_n^{(\alpha ,\alpha )}(x)\,\text {d}x=\int _{-1}^1 (1-x^2)^{\alpha /2+1} {\tilde{f}}(x) \text {P}_n^{(\alpha ,\alpha )}(x)\,\text {d}x,\qquad n\in {\mathbb {Z}}_+ \end{aligned}$$

and, once \(\alpha =2\), we have the standard expansion of \({\tilde{f}}\) in the ultraspherical polynomial basis \(\{\text {P}_n^{(2,2)}\}_{n\in {\mathbb {Z}}_+}\), and standard results for expansions in orthogonal polynomials (cf. for example [29]) apply. In particular,

$$\begin{aligned} \limsup _{n\rightarrow \infty }|{\hat{f}}_n|^{1/n}=\rho \in [0,1) \end{aligned}$$

and the rate of convergence in the \(\text {L}_2\) norm is at least exponential. The same is true for any even integer \(\alpha \ge 2\).

Similar reasoning applies to other weights and to higher-order zero boundary conditions.

Needless to say, choosing \(\alpha \in 2{\mathbb {N}}\) is essential, but the ultimate choice depends on the number of zero boundary conditions, in the spirit of the previous two subsections.

4.3 Outstanding Computational and Theoretical Challenges

This is the first paper to consider W-functions in an organised way, although of course Hermite functions have been used and investigated extensively, and W-functions associated with Freud weights (which are special because of Theorem 2) have been introduced in [19]. Needless to say, this work neither resolves all the mathematical and computational issues associated with W-functions nor claims to do so. While there are important theoretical questions, e.g. to characterise all separable or symmetrically separable weight functions, perhaps the most urgent issues are related to the applications of W-functions to spectral methods. This concerns issues in approximation theory (speed of convergence in different function classes), as well as purely computational questions. The speed of approximation points to an imperfect duality between W-functions and the functions \(\Psi =\{\psi _n\}_{n\in {\mathbb {Z}}_+}\) from Sect. 1. Recalling \({\hat{f}}_n^\Phi \), the nth expansion coefficient in \(\Phi \), and letting \({\check{w}}(x)=w(x)\chi _{(a,b)}(x)\), the Plancherel theorem yields at once for every \(n\in {\mathbb {Z}}_+\)

$$\begin{aligned} {\hat{f}}_n^\Phi =\int _a^b f(x) \varphi _n(x)\,\text {d}x=\int _{-\infty }^\infty \sqrt{{\check{w}}(x)} f(x) p_n(x)\,\text {d}x=(-{\mathrm i})^n \int _{-\infty }^\infty {\hat{f}}(\xi ) \overline{\psi _n(\xi )}\,\text {d}\xi , \end{aligned}$$

and we recover an expansion in \(\Psi \) of the Fourier transform of f. This duality, though, is imperfect because, unless \((a,b)={\mathbb {R}}\), it is valid (insofar as \(\Psi \) is concerned) only in the Paley–Wiener space \({\mathcal {P}}_{(a,b)}({\mathbb {R}})\) rather than in \(\text {L}_2({\mathbb {R}})\) [11]. Moreover, comprehensive convergence theory for functions of the form \(\Psi \) is also lacking. Yet, even an imperfect duality might potentially lead to useful outcomes.

The final issue we wish to mention is fast computation of expansion coefficients in a W-function basis, similar perhaps to fast expansion algorithms in polynomial bases [22]. All this is a matter for future research.